SECTION I


INTRODUCTORY REMARKS 
                                                                  Paragraph
Scope of this text ................................................... 1
Mental equipment necessary for cryptanalytic work .................... 2
Validity of results of cryptanalysis ................................. 3

1. Scope of this text.-a. It is assumed that the student has studied the two preceding
texts written by the same author and forming part of this series, viz, Elementary Military
Cryptography, and Advanced Military Cryptography. These texts deal exclusively with cryptography
as defined therein; that is, with the various types of ciphers and codes, their principles of
construction, and their employment in cryptographing and decryptographing messages. Particular
emphasis was placed upon such means and methods as are practicable for military usage. It is
also assumed that the student has firmly in mind the technically precise, special nomenclature
employed in those texts, for the terms and definitions therein will all be used in the present text,
with essentially the same significances. If this is not the case, it is recommended that the student
review his preceding work, in order to regain a familiarity with the specific meanings assigned
to the terms used therein. There will be no opportunity herein to repeat this information and
unless he understands clearly the significance of the terms employed, his progress will be retarded.
b. This text constitutes the first of a series of texts on cryptanalysis. Although most of the 
information contained herein is applicable to cryptograms of various types and sources, special 
emphasis will be laid upon the principles and methods of solving military cryptograms. Except
for an introductory discussion of fundamental principles underlying the science of cryptanalytics,
this first text in the series will deal solely with the principles and methods for the analysis of
monoalphabetic substitution ciphers. Even with this limitation it will be possible to discuss
only a few of the many variations of this one type; but with a firm grasp upon the general prin- 
ciples no difficulties should be experienced with any variations that may be encountered. 
c. This and some of the succeeding texts will deal only with elementary types of cipher 
systems not because they may be encountered in military operations but because their study is 
essential to an understanding of the principles underlying the solution of the modern, very much 
more complex types of ciphers and codes that are likely to be employed by the larger govern- 
ments today in the conduct of their military affairs in time of war. 


d. All of this series of texts will deal only with the solution of visible secret writing. At 
some future date, texts dealing with the solution of invisible secret writing, and with secret 
signalling systems, may be prepared. 


2. Mental equipment necessary for cryptanalytic work.-a. Captain Parker Hitt, in the
first United States Army manual 1 dealing with cryptography, opens the first chapter of his 
valuable treatise with the following sentence: 


Success in dealing with unknown ciphers is measured by these four things in the order named: perseverance,
careful methods of analysis, intuition, luck.


1 Hitt, Capt. Parker, Manual for the Solution of Military Ciphers. Army Service Schools Press, Fort
Leavenworth, Kansas, 1916. 2d Edition, 1918. (Both out of print.)


These words are as true today as they were then. There is no royal road to success in the 
solution of cryptograms. Hitt goes on to say: 


Cipher work will have little permanent attraction for one who expects results at once, without labor, for 
there is a vast amount of purely routine labor in the preparation of frequency tables, the rearrangement of 
ciphers for examination, and the trial and fitting of letter to letter before the message begins to appear. 
The present author deems it advisable to add that the kind of work involved in solving 
cryptograms is not at all similar to that involved in solving "cross-word puzzles", for example. 
The wide vogue the latter have had and continue to have is due to the appeal they make to the 
quite common instinct for mysteries of one sort or another; but in solving a cross-word puzzle 
there is usually no necessity for performing any preliminary labor, and palpable results become 
evident after the first minute or two of attention. This successful start spurs the cross-word 
"addict" on to complete the solution, which rarely requires more than an hour's time. Further- 
more, cross-word puzzles are all alike in basic principle and once understood, there is no more to 
learn. Skill comes largely from the embellishment of one's vocabulary, though, to be sure, con- 
stant practice and exercise of the imagination contribute to the ease and rapidity with which 
solutions are generally reached. In solving cryptograms, however, many principles must be 
learned, for there are many different systems of varying degrees of complexity. Even some of 
the simpler varieties require the preparation of tabulations of one sort or another, which many 
people find irksome; moreover, it is only toward the very close of the solution that results in the 
form of intelligible text become evident. Often, indeed, the student will not even know whether
he is on the right track until he has performed a large amount of preliminary "spade work" 
involving many hours of labor. Thus, without at least a willingness to pursue a fair amount of 
theoretical study, and a more than average amount of patience and perseverance, little skill and 
experience can be gained in the rather difficult art of cryptanalysis. General Givierge, the author 
of an excellent treatise on cryptanalysis, remarks in this connection: 2 


The cryptanalyst's attitude must be that of William the Silent: No need to hope in order to undertake, nor 
to succeed in order to persevere. 


b. As regards Hitt's reference to careful methods of analysis, before one can be said to be a 
cryptanalyst worthy of the name it is necessary that one should have firstly a sound knowledge 
of the basic principles of cryptanalysis, and secondly, a long, varied, and active practical experience
in the successful application of those principles. It is not sufficient to have read treatises
on this subject. One month's actual practice in solution is worth a whole year's mere reading 
of theoretical principles. An exceedingly important element of success in solving the more 
intricate ciphers is the possession of the rather unusual mental faculty designated in general 
terms as the power of inductive and deductive reasoning. Probably this is an inherited rather 
than an acquired faculty; the best sort of training for its emergence, if latent in the individual, 
and for its development is the study of the natural sciences, such as chemistry, physics, biology, 
geology, and the like. Other sciences such as linguistics and philology are also excellent. Apti- 
tude in mathematics is quite important, more especially in the solution of ciphers than of codes. 
c. An active imagination, or perhaps what Hitt and other writers call intuition, is essential, 
but mere imagination uncontrolled by a judicious spirit will more often be a hindrance than a 
help. In practical cryptanalysis the imaginative or intuitive faculties must, in other words, be 
guided by good judgment, by practical experience, and by as thorough a knowledge of the general 
situation or extraneous circumstances that led to the sending of the cryptogram as is possible 
to obtain. In this respect the many cryptograms exchanged between correspondents whose 
identities and general affairs, commercial, social, or political, are known are far more readily 


2 Givierge, Général Marcel, Cours de Cryptographie, Paris, 1925, p. 301.
solved than are isolated cryptograms exchanged between unknown correspondents, dealing with
unknown subjects. It is obvious that in the former case there are good data upon which the
intuitive powers of the cryptanalyst can be brought to bear, whereas in the latter case no such
data are available. Consequently, in the absence of such data, no matter how good the imagina- 
tion and intuition of the cryptanalyst, these powers are of no particular service to him. Some 
writers, however, regard the intuitive spirit as valuable from still another viewpoint, as may be 
noted in the following: 3 


Intuition, like a flash of lightning, lasts only for a second. It generally comes when one is tormented by
a difficult decipherment and when one reviews in his mind the fruitless experiments already tried. Suddenly
the light breaks through and one finds after a few minutes what previous days of labor were unable to reveal. 
This, too, is true, but unfortunately there is no way in which the intuition may be summoned
at will, when it is most needed.4 There are certain authors who regard as indispensable
the possession of a somewhat rare, rather mysterious faculty that they designate by the word
"flair", or by the expression "cipher brains." Even so excellent an authority as General
Givierge,5 in referring to this mental facility, uses the following words: "Over and above perseverance
and this aptitude of mind which some authors consider a special gift, and which they
call intuition, or even, in its highest manifestation, clairvoyance, cryptographic studies will 
continue more and more to demand the qualities of orderliness and memory." Although the 
present author believes a special aptitude for the work is essential to cryptanalytic success, he is 
sure there is nothing mysterious about the matter at all. Special aptitude is prerequisite to 
success in all fields of endeavor. There are, for example, thousands of physicists, hundreds of 
excellent ones, but only a handful of world-wide fame. Should it be said, then, that a physicist 


3 Lange et Soudart, Traité de Cryptographie, Librairie Félix Alcan, Paris, 1925, p. 104.
4 The following extracts are of interest in this connection:

The fact that the scientific investigator works 50 per cent of his time by non-rational means is, it seems, quite
insufficiently recognized. There is without the least doubt an instinct for research, and often the most successful
investigators of nature are quite unable to give an account of their reasons for doing such and such an experiment,
or for placing side by side two apparently unrelated facts. Again, one of the most salient traits in the
character of the successful scientific worker is the capacity for knowing that a point is proved when it would not
appear to be proved to an outside intelligence functioning in a purely rational manner; thus the investigator
feels that some proposition is true, and proceeds at once to the next set of experiments without waiting and wasting
time in the elaboration of the formal proof of the point which heavier minds would need. Questionless such a
scientific intuition may and does sometimes lead investigators astray, but it is quite certain that if they did
not widely make use of it, they would not get a quarter as far as they do. Experiments confirm each other, and a
false step is usually soon discovered. And not only by this partial replacement of reason by intuition does the
work of science go on, but also to the born scientific worker--and emphatically they cannot be made--the structure
of the method of research is as it were given, he cannot explain it to you, though he may be brought to agree
a posteriori to a formal logical presentation of the way the method works.-Excerpt from Needham, Joseph,
The Sceptical Biologist, London, 1929, p. 79.
The essence of scientific method, quite simply, is to try to see how data arrange themselves into causal
configurations. Scientific problems are solved by collecting data and by "thinking about them all the time."
We need to look at strange things until, by the appearance of known configurations, they seem familiar, and to
look at familiar things until we see novel configurations which make them appear strange. We must look at
events until they become luminous. That is scientific method . . . Insight is the touchstone . . . The application
of insight as the touchstone of method enables us to evaluate properly the role of imagination in scientific
method. The scientific process is akin to the artistic process: it is a process of selecting out those elements of
experience which fit together and recombining them in the mind. Much of this kind of research is simply a
ceaseless mulling over, and even the physical scientist has considerable need of an armchair . . . Our view of
scientific method as a struggle to obtain insight forces the admission that science is half art . . . Insight is the
unknown quantity which has eluded students of scientific method.-Excerpts from an article entitled Insight and
Scientific Method, by Willard Waller, in The American Journal of Sociology, Vol. XL, 1934.

5 Op. cit., p. 302.
who has achieved very notable success in his field has done so because he is the fortunate possessor
of a mysterious faculty? That he is fortunate in possessing a special aptitude for his subject is 
granted, but that there is anything mysterious about it, partaking of the nature of clairvoyance 
(if, indeed, the latter is a reality) is not granted. While the ultimate nature of any mental 
process seems to be as complete a mystery today as it has ever been, the present author would 
like to see the superficial veil of mystery removed from a subject that has been shrouded in 
mystery from even before the Middle Ages down to our own times. 
(The principal and easily understandable reason for this is that governments have always closely guarded cryptographic
secrets and anything so guarded soon becomes "mysterious.") He would, rather, have the 
student approach the subject as he might approach any other science that can stand on its own 
merits with other sciences, because cryptanalytics, like other sciences, has a practical importance 
in human affairs. It presents to the inquiring mind an interest in its own right as a branch of 
knowledge; it, too, holds forth many difficulties and disappointments, and these are all the more 
keenly felt when the nature of these difficulties is not understood by those unfamiliar with the 
special circumstances that very often are the real factors that led to success in other cases. 
Finally, just as in the other sciences wherein many men labor long and earnestly for the true 
satisfaction and pleasure that comes from work well-done, so the mental pleasure that the 
successful cryptanalyst derives from his accomplishments is very often the only reward for much
of the drudgery that he must do in his daily work. General Givierge's words in this connection
are well worth quoting: 6


Some studies will last for years before bearing fruit. In the case of others, cryptanalysts undertaking them
never get any result. But, for a cryptanalyst who likes the work, the joy of discoveries effaces the memory of his 
hours of doubt and impatience. 
d. With his usual deft touch, Hitt says of the element of luck, as regards the role it plays in 
analysis: 


As to luck, there is the old miners' proverb: "Gold is where you find it." 


The cryptanalyst is lucky when one of the correspondents whose ciphers he is studying 
makes a blunder that gives the necessary clue; or when he finds two cryptograms identical in 
text but in different keys in the same system; or when he finds two cryptograms identical in 
text but in different systems, and so on. The element of luck is there, to be sure, but the cryptanalyst
must be on the alert if he is to profit by these lucky "breaks."


e. If the present author were asked to state, in view of the progress in the field since 1916, 
what elements might be added to the four ingredients Hitt thought essential to cryptanalytic 
success, he would be inclined to mention the following: 
(1) A broad, general education, embodying interests covering as many fields of practical
knowledge as possible. This is useful because the cryptanalyst is often called upon to solve
messages dealing with the most varied of human activities, and the more he knows about these 
activities, the easier his task. 
(2) Access to a large library of current literature, and wide and direct contacts with sources 
of collateral information. These often afford clues as to the contents of specific messages. For 
example, to be able instantly to have at his disposal a newspaper report or a personal report of 
events described or referred to in a message under investigation goes a long way toward simplifying
or facilitating solution. Government cryptanalysts are sometimes fortunately situated in
this respect, especially where various agencies work in harmony. 
(3) Proper coordination of effort. This includes the organization of cryptanalytic personnel 
into harmonious, efficient teams of cooperating individuals. 


6 Op. cit., p. 301.
(4) Under mental equipment he would also include the faculty of being able to concentrate
on a problem for rather long periods of time, without distraction, nervous irritability, and
impatience. The strain under which cryptanalytic studies are necessarily conducted is quite
severe and too long-continued application has the effect of draining nervous energy to an 
unwholesome degree, so that a word or two of caution may not here be out of place. One should 
continue at work only so long as a peaceful, calm spirit prevails, whether the work is fruitful or 
not. But just as soon as the mind becomes wearied with the exertion, or just as soon as a feeling
of hopelessness or mental fatigue intervenes, it is better to stop completely and turn to other 
activities, rest, or play. It is essential to remark that systematization and orderliness of work 
are aids in reducing nervous tension and irritability. On this account it is better to take the 
time to prepare the data carefully, rewrite the text if necessary, and so on, rather than work 
with slipshod, incomplete, or improperly arranged material. 


(5) A retentive memory is an important asset to cryptanalytic skill, especially in the solution
of codes. The ability to remember individual groups, their approximate locations in other
messages, the associations they form with other groups, their peculiarities and similarities, saves
much wear and tear of the mental machinery, as well as much time in looking up these groups in
indexes. 
f. It may be advisable to add a word or two at this point to prepare the student to expect
slight mental jars and tensions which will almost inevitably come to him in the conscientious
study of this and the subsequent texts. The present author is well aware of the complaint of
students that authors of texts on cryptanalysis base much of their explanation upon their fore- 
knowledge of the "answer"-which the student does not know while he is attempting to follow 
the solution with an unbiased mind. They complain too that these authors use such expressions 
as "obviously", "naturally", "of course", "it is evident that", and so on, when the circumstances 
seem not at all to warrant their use. There is no question but that this sort of treatment is apt 
to discourage the student, especially when the point elucidated becomes clear to him only after 
many hours' labor, whereas, according to the book, the author noted the weak spot at the first 
moment's inspection. The present author can only promise to try to avoid making the steps 
appear to be much more simple than they really are, and to suppress glaring instances of unjustifiable
"jumping at conclusions." At the same time he must indicate that for pedagogical reasons
in many cases a message has been consciously "manipulated" so as to allow certain principles to
become more obvious in the illustrative examples than they ever are in practical work. During
the course of some of the explanations attention will even be directed to cases of unjustified
inferences. Furthermore, of the student who is quick in observation and deduction, the author 
will only ask that he bear in mind that if the elucidation of certain principles seems prolix and 
occupies more space than necessary, this is occasioned by the author's desire to carry the 
explanation forward in very short, easily-comprehended, and plainly-described steps, for the 
benefit of students who are perhaps a bit slower to grasp but who, once they understand, are 
able to retain and apply principles slowly learned just as well, if not better than the students 
who learn more quickly. 
3. Validity of results of cryptanalysis.-Valid, or authentic cryptanalytic solutions cannot 
and do not represent "opinions" of the cryptanalyst. They are valid only so far as they are 
wholly objective, and are susceptible of demonstration and proof, employing authentic, objective 
methods. It should hardly be necessary (but an attitude frequently encountered among laymen 
makes it advisable) to indicate that the validity of the results achieved by any serious crypt- 
analytic studies on authentic material rests upon the same sure foundations and are reached by 
the same general steps as the results achieved by any other scientific studies; viz, observation, 
hypothesis, deduction and induction, and confirmatory experiment. Implied in the latter is the 
possibility that two or more qualified investigators, each working independently upon the same
material, will achieve identical (or practically identical) results. Occasionally a pseudo-cryptanalyst
offers "solutions" which cannot withstand such tests; a second, unbiased, investigator
working independently either cannot consistently apply the methods alleged to have been applied
by the pseudo-cryptanalyst, or else, if he can apply them at all, the results (plain-text translations)
are far different in the two cases. The reason for this is that in such cases it is generally
found that the "methods" are not clear-cut, straightforward or mathematical in character.
Instead, they often involve the making of judgments on matters too tenuous to measure, weigh,
or otherwise subject to careful scrutiny. In such cases, the conclusion to which the unprejudiced
observer is forced to come is that the alleged "solution" obtained by the first investigator, the
pseudo-cryptanalyst, is purely subjective. In nearly all cases where this has happened (and they
occur from time to time) there has been uncovered nothing which can in any way be used to
impugn the integrity of the pseudo-cryptanalyst. The worst that can be said of him is that he
has become a victim of a special or peculiar form of self-delusion, and that his desire to solve the
problem, usually in accord with some previously-formed opinion, or notion, has over-balanced,
or undermined, his judgment and good sense.7


7 Specific reference can be made to the following typical "case histories":

Donnelly, Ignatius, The Great Cryptogram. Chicago, 1888.
Owen, Orville W., Sir Francis Bacon's Cipher Story. Detroit, 1895.
Gallup, Elizabeth Wells, Francis Bacon's Biliteral Cipher. Detroit, 1900.
Margoliouth, D. S., The Homer of Aristotle. Oxford, 1923.
Newbold, William Romaine, The Cipher of Roger Bacon. Philadelphia, 1928. (For a scholarly and
    complete demolition of Professor Newbold's work, see an article entitled Roger Bacon and
    the Voynich MS, by John M. Manly, in Speculum, Vol. VI, No. 3, July 1931.)
Arensberg, Walter Conrad, The Cryptography of Shakespeare. Los Angeles, 1922.
    The Shakespearean Mystery. Pittsburgh, 1928.
    The Baconian Keys. Pittsburgh, 1928.
Feely, Joseph Martin, The Shakespearean Cypher. Rochester, N. Y., 1931.
    Deciphering Shakespeare. Rochester, N. Y., 1934.

                                                                  Paragraph
The four basic operations in cryptanalysis ........................... 4
The determination of the language employed ........................... 5
The determination of the general system .............................. 6
The reconstruction of the specific key ............................... 7
The reconstruction of the plain text ................................. 8

4. The four basic operations in cryptanalysis.-a. The solution of practically every crypto- 
gram involves four fundamental operations or steps: 
(1) The determination of the language employed in the plain-text version. 
(2) The determination of the general system of cryptography employed. 
(3) The reconstruction of the specific key in the case of a cipher system, or the reconstruc- 
tion, partial or complete, of the code book, in the case of a code system; or both, in the case of an 
enciphered code system. 
(4) The reconstruction or establishment of the plain text. 
b. These operations will be taken up in the order in which they are given above and in which 
they usually are performed in the solution of cryptograms, although occasionally the second 
step may precede the first. 
5. The determination of the language employed.-a. There is not much that need be said
with respect to this operation except that the determination of the language employed seldom 
comes into question in the case of studies made of the cryptograms of an organized enemy. 
By this is meant that during wartime the enemy is of course known, and it follows, therefore, 
that the language he employs in his messages will almost certainly be his native or mother tongue. 
Only occasionally nowadays is this rule broken. Formerly it often happened, or it might have 
indeed been the general rule, that the language used in diplomatic correspondence was not the 
mother tongue, but French. In isolated instances during the World War, the Germans used 
English when their own language could for one reason or another not be employed. For example, 
for a year or two before the entry of the United States into that war, during the time America 
was neutral and the German Government maintained its embassy in Washington, some of the 
messages exchanged between the Foreign Office in Berlin and the Embassy in Washington were 
cryptographed in English, and a copy of the code used was deposited with the Department of 
State and our censor. Another instance is found in the case of certain Hindu conspirators who 
were associated with and partially financed by the German Government in 1915 and 1916; they 
employed English as the language of their cryptographic messages. Occasionally the crypto- 
grams of enemy agents may be in a language different from that of the enemy. But in general 
these are, as has been said, isolated instances; as a rule, the language used in cryptograms exchanged
between members of large organizations is the mother tongue of the correspondents.
Where this is not the case, that is, when cryptograms of unknown origin must be studied, the 
cryptanalyst looks for any indications on the cryptograms themselves which may lead to a 
conclusion as to the language employed. Address, signature, and plain-language words in the 
preamble or in the body of the text all come under careful scrutiny, as well as all extraneous 
circumstances connected with the manner in which the cryptograms were obtained, the person 
on whom they were found, or the locale of their origin and destination. 
b. In special cases, or under special circumstances a clue to the language employed is found
in the nature and composition of the cryptographic text itself. For example, if the letters K and 
W are entirely absent or appear very rarely in messages, it may indicate that the language is 
Spanish, for these letters are absent in the alphabet of that language and are used only to spell 
foreign words or names. The presence of accented letters or letters marked with special signs of 
one sort or another, peculiar to certain languages, will sometimes indicate the language used. 
The Japanese Morse telegraph alphabet and the Russian Morse telegraph alphabet contain 
combinations of dots and dashes which are peculiar to those alphabets and thus the interception 
of messages containing these special Morse combinations at once indicates the language involved. 
Finally, there are certain peculiarities of alphabetic languages which, in certain types of crypto- 
grams, viz, pure transposition, give clues as to the language used. For example, the frequent 
digraph C H, in German, leads to the presence, in cryptograms of the type mentioned, of many 
isolated C's and H's; if this is noted, the cryptogram may be assumed to be in German. 
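
The clue from K and W described above lends itself to a simple count. The following is a minimal sketch only, not a procedure given in this text: it assumes the cryptogram preserves the identities of the plain-text letters (as in pure transposition), and the function names and the 0.2 percent threshold are hypothetical choices made for illustration.

```python
from collections import Counter
import string


def letter_counts(text: str) -> Counter:
    """Count the letters A-Z, ignoring case, spaces, digits, and punctuation."""
    return Counter(ch for ch in text.upper() if ch in string.ascii_uppercase)


def rare_letters(text: str, letters: str = "KW", threshold: float = 0.002) -> dict:
    """For each letter of interest, return (relative frequency, is_rare).

    The threshold is an arbitrary illustrative value, not a figure
    taken from the text.
    """
    counts = letter_counts(text)
    total = sum(counts.values())
    result = {}
    for letter in letters:
        freq = counts[letter] / total if total else 0.0
        result[letter] = (freq, freq < threshold)
    return result


if __name__ == "__main__":
    sample = "EL ATAQUE COMIENZA AL AMANECER"
    for letter, (freq, is_rare) in rare_letters(sample).items():
        print(letter, round(freq, 4), "rare" if is_rare else "not rare")
```

A result in which both K and W are marked rare is at most a hint consistent with Spanish; a short message in any language may happen to lack these letters, so the count should be weighed together with the other indications discussed in this paragraph.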
c. In some cases it is perfectly possible to perform certain steps in cryptanalysis before 
the language of the cryptogram has been definitely determined. Frequency studies, for example, 
may be made and analytic processes performed without this knowledge, and by a cryptanalyst 
wholly unfamiliar with the language even if it has been identified, or who knows only enough 
about the language to enable him to recognize valid combinations of letters, syllables, or a few 
common words in that language. He may, after this, call to his assistance a translator who may 
not be a cryptanalyst but who can materially aid in making necessary assumptions based upon 
his special knowledge of the characteristics of the language in question. Thus, cooperation 
between cryptanalyst and translator results in solution.1 
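
The frequency studies mentioned in subparagraph c require no knowledge of the language, since they amount to counting. A minimal sketch follows, under the assumption that the cryptogram is available as a string of characters; the function name frequency_table is hypothetical and chosen only for this illustration.

```python
from collections import Counter


def frequency_table(cryptogram: str) -> list[tuple[str, int]]:
    """Return (letter, count) pairs, most frequent first.

    Non-letters (spaces, digits, punctuation) are ignored. Ties are
    broken alphabetically so that the output is reproducible.
    """
    letters = [ch for ch in cryptogram.upper() if ch.isalpha()]
    counts = Counter(letters)
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for letter, count in frequency_table("XQZZA XQPPL ZZQXA"):
        # Print a simple tally line for each letter.
        print(f"{letter} {count:3d} {'|' * count}")
```

Such a table can be prepared by someone wholly unfamiliar with the language; interpreting it is the point at which the translator's special knowledge becomes useful.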
6. The determination of the general system.-a. Except in the case of the more simple 
types of cryptograms, the determination of the general system according to which a given crypto- 
gram has been produced is usually a difficult, if not the most difficult, step in its solution. The 
reason for this is not hard to find. 
b. As will become apparent to the student as he proceeds with his study, in the final analysis,
the solution of every cryptogram involving a form of substitution depends upon its reduction to mono- 
alphabetic terms, if it is not originally in those terms. This is true not only of ordinary substitution 
ciphers, but also of combined substitution-transposition ciphers, and of enciphered code. If the 
cryptogram must be reduced to monoalphabetic terms, the manner of its accomplishment is 
usually indicated by the cryptogram itself, by external or internal phenomena which become 
apparent to the cryptanalyst as he studies the cryptogram. If this is impossible, or too difficult,
the cryptanalyst must, by one means or another, discover how to accomplish this reduction, 
by bringing to bear all the special or collateral information he can get from all the sources at his 
command. If both these possibilities fail him, there is little left but the long, tedious, and often 
fruitless process of elimination. In the case of transposition ciphers of the more complex type, 
the discovery of the basic method is often simply a matter of long and tedious elimination of 
possibilities. For cryptanalysis has unfortunately not yet attained, and may indeed never 
attain, the precision found today in qualitative analysis in chemistry, for example, where the 
analytic process is absolutely clear cut and exact in its dichotomy. A few words in explanation of 
what is meant may not be amiss. When a chemist seeks to determine the identity of an unknown 


1 The writer has seen in print statements that "during the World War ... decoded messages in Japanese
and Russian without knowing a word of either language." The extent to which such statements are exaggerated
will soon become obvious to the student. Of course, there are occasional instances in which a mere clerk with
quite limited experience may be able to "solve" a message in an extremely simple system in a language of which
he has no knowledge at all; but such a "solution" calls for nothing more arduous than the ability to recognize
pronounceable combinations of vowels and consonants, an ability that hardly deserves to be rated as
"cryptanalytic" in any real sense. To say that it is possible to solve a cryptogram in a foreign language
"without knowing a word of that language" is not quite the same as to say that it is possible to do so with only
a slight knowledge of the language; and it may be stated without cavil that the better the cryptanalyst's
knowledge of the language, the greater are the chances for his success and, in any case, the easier is his work.
substance, he applies certain specific reagents to the substance and in a specific sequence. The
first reagent tells him definitely into which of two primary classes the unknown substance falls. 
He then applies a second test with another specific reagent, which tells him again quite definitely 
into which of two secondary classes the unknown substance falls, and so on, until finally he has 
reduced the unknown substance to its simplest terms and has found out what it is. In striking
contrast to this situation, cryptanalysis affords exceedingly few "reagents" or tests that may be 
applied to determine positively that a given cipher belongs to one or the other of two systems 
yielding externally similar results. And this is what makes the analysis of an isolated, complex 
cryptogram so difficult. Note the limiting adjective "isolated" in the foregoing sentence, for it 
is used advisedly. It is not often that the general system fails to disclose itself or cannot be
discovered by painstaking investigation when there is a great volume of text accumulating from 
a regular traffic between numerous correspondents in a large organization. Sooner or later the 
system becomes known, either because of blunders and carelessness on the part of the personnel 
entrusted with the cryptographing of the messages, or because the accumulation of text itself 
makes possible the determination of the general system by cryptanalytic studies. But in the 
case of a single or even a few isolated cryptograms concerning which little or no information can
be gained by the cryptanalyst, he is often unable, without a knowledge of, or a shrewd guess as to 
the general system employed, to decompose the heterogeneous text of the cryptogram into 
homogeneous, monoalphabetic text, which is the ultimate and essential step in analysis. The 
only knowledge that the cryptanalyst can bring to his aid in this most difficult step is that gained 
by long experience and practice in the analysis of many different types of systems. 
c. On account of the complexities surrounding this particular phase of cryptanalysis, and 
because in any scheme of analysis based upon successive eliminations of alternatives the cryptanalyst
can only progress so far as the extent of his own knowledge of all the possible alternatives
will permit, it is necessary that detailed discussion of the eliminative process be postponed until 
the student has covered most of the field. For example, the student will perhaps want to know
at once how he can distinguish a cryptogram that is in code or enciphered code from one
that is in cipher. It is at this stage of his studies impracticable to give him any helpful indications
on his question. In return it may be asked of him why he should expect to be able to do
this in the early stages of his studies when often the experienced expert cryptanalyst is baffled on
the same score!


d. Nevertheless, in lieu of more precise tests not yet discovered, a general guide that may be
useful in cryptanalysis will be built up, step by step as the student progresses, in the form of a
series of charts comprising what may be designated An Analytical Key For Cryptanalysis. (See
Par. 50.) It may be of assistance to the student if, as he proceeds, he will carefully study the
charts and note the place which the particular cipher he is solving occupies in the general
cryptanalytic panorama. These charts admittedly constitute only very brief outlines, and can
therefore be of but little direct assistance to him in the analysis of the more complex types of 
ciphers he may encounter later on. So far as they go, however, they may be found to be quite 
useful in the study of elementary cryptanalysis. For the experienced cryptanalyst they can 
serve only as a means of assuring that no possible step or process is inadvertently overlooked in 
attempts to solve a difficult cipher. 


e. Much of the labor involved in cryptanalytic work, as referred to in Par. 2, is connected 
with this determination of the general system. The preparation of the text, its rewriting in
different forms, sometimes being rewritten in a half dozen ways, the recording of letters, the 
establishment of frequencies of occurrences of letters, comparisons and experiments made with 
known material of similar character, and so on, constitute much labor that is most often in- 
dispensable, but which sometimes turns out to have been wholly unnecessary, or in vain. In a 
recent treatise 2 it is stated quite boldly that "this work once done, the determination of the
system is often relatively easy." This statement can certainly apply only to the simpler types of 
ciphers; it is entirely misleading as regards the much more frequently encountered complex 
cryptograms of modern times. 


7. The reconstruction of the specific key.-a. Nearly all practical cryptographic methods
require the use of a specific key to guide, control, or modify the various steps under the general 
system. Once the latter has been disclosed, discovered, or has otherwise come into the possession 
of the cryptanalyst, the next step in solution is to determine, if necessary, and if possible, the 
specific key that was employed to cryptograph the message or messages under examination. 
This determination may not be in complete detail; it may go only so far as to lead to a knowledge 
of the number of alphabets involved in a substitution cipher, or the number of columns involved 
in a transposition cipher, or that a one-part code has been used, in the case of a code system.
But it is often desirable to determine the specific key in as complete a form and with as much
detail as possible, for this information will very frequently be useful in the solution of subsequent
cryptograms exchanged between the same correspondents, since the nature of the specific key
in a solved case may be expected to give clues to the specific key in an unsolved case.
b. Frequently, however, the reconstruction of the key is not a prerequisite to, and does not
constitute an absolutely necessary preliminary step in, the fourth basic operation, viz, the
reconstruction or establishment of the plain text. In many cases, indeed, the two processes are
carried along simultaneously, the one assisting the other, until in the final stages both have been
completed in their entireties. In still other cases the reconstruction of the specific key may 
succeed instead of precede the reconstruction of the plain text, and is accomplished purely as a 
matter of academic interest; or the specific key may, in unusual cases, never be reconstructed. 
8. The reconstruction of the plain text.-a. Little need be said at this point on this phase
of cryptanalysis. The process usually consists, in the case of substitution ciphers, in the
establishment of equivalency between specific letters of the cipher text and the plain text, letter by
letter, pair by pair, and so on, depending upon the particular type of substitution system
involved. In the case of transposition ciphers, the process consists in rearranging the elements of
the cipher text, letter by letter, pair by pair, or occasionally word by word, depending upon the
particular type of transposition system involved, until the letters or words have been returned
to their original plain-text order. In the case of code, the process consists in determining the 
meaning of each code group and inserting this meaning in the code text to reestablish the original 
plain text. 


b. The foregoing processes do not, as a rule, begin at the beginning of a message and
continue letter by letter, or group by group in sequence up to the very end of the message. The
establishment of values of cipher letters in substitution methods, or of the positions to which
cipher letters should be transferred to form the plain text in the case of transposition methods,
comes at very irregular intervals in the process. At first only one or two values scattered here
and there throughout the text may appear; these then form the "skeletons" of words, upon which
further work, by a continuation of the reconstruction process, is made possible; in the end the
complete or nearly complete 3 text is established.
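The "skeleton" stage described above can be illustrated with a minimal sketch. The cipher text and the three recovered values below are hypothetical, chosen only for illustration, and the function name `skeleton` is likewise an assumption and not a term of this text.

```python
def skeleton(cipher_text: str, known: dict[str, str], blank: str = "-") -> str:
    """Show recovered plain-text values; mark still-unknown letters with `blank`."""
    return "".join(
        ch if not ch.isalpha() else known.get(ch, blank)
        for ch in cipher_text
    )


# Hypothetical cipher text and hypothetical partial recoveries (cipher -> plain).
cipher = "WKH HQHPB ZLOO DWWDFN"
recovered = {"H": "E", "W": "T", "D": "A"}

print(skeleton(cipher, recovered))  # T-E E-E-- ---- ATTA--
```

Each newly established value is added to `recovered` and the skeleton is printed again, so that the partially filled words suggest further values.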
c. In the case of cryptograms in a foreign language, the translation of the solved messages 
is a final and necessary step, but is not to be considered as a cryptanalytic process. However, 
it is commonly the case that the translation process will be carried on simultaneously with the 
cryptanalytic, and will aid the latter, especially when there are lacunae which may be filled in
from the context. (See also Par. 5c in this connection.)


2 Lange et Soudart, op. cit., p. 106.
3 Sometimes in the case of code, the meaning of a few code groups may be lacking, because there is insufficient
text to establish their meaning.


SECTION III 


FREQUENCY DISTRIBUTIONS 


                                                                                      Paragraph
The simple or uniliteral frequency distribution ........................................... 9
Important features of the normal uniliteral frequency distribution ....................... 10
Constancy of the standard or normal uniliteral frequency distribution .................... 11


9. The simple or uniliteral frequency distribution.-a. It has long been known to cryptographers
and typographers that the letters composing the words of any intelligible written text
composed in any language which is alphabetic in construction are employed with greatly varying
frequencies. For example, if on cross-section paper a simple tabulation, shown in Fig. 1, called a
uniliteral frequency distribution, is made of the letters composing the words of the preceding
sentence, the variation in frequency is strikingly demonstrated. It is seen that whereas certain
letters, such as A, E, I, N, O, R, S, and T, are employed very frequently, other letters, such as
C, G, P, and W are employed not nearly so frequently, while still other letters, such as F, J, Q, V,
and Z are employed either seldom or not at all.


(Total=200 letters) 


FIGURE 1.
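A tabulation of this kind can be sketched in a few lines. The sketch below counts the letters of the opening sentence of subparagraph a and prints one row of tally strokes per letter; the names `SENTENCE` and `uniliteral_distribution` are illustrative assumptions, and the printed layout only imitates, and does not reproduce, Fig. 1.

```python
from collections import Counter
import string

# Opening sentence of subparagraph a (the text on which Fig. 1 is based).
SENTENCE = (
    "It has long been known to cryptographers and typographers that the "
    "letters composing the words of any intelligible written text composed "
    "in any language which is alphabetic in construction are employed with "
    "greatly varying frequencies."
)


def uniliteral_distribution(text: str) -> Counter:
    """Count each letter A-Z, ignoring spaces, digits, and punctuation."""
    return Counter(ch for ch in text.upper() if ch in string.ascii_uppercase)


dist = uniliteral_distribution(SENTENCE)
total = sum(dist.values())

for letter in string.ascii_uppercase:
    # One stroke per occurrence; letters that never occur get an empty row.
    print(f"{letter} {dist[letter]:3d} {'|' * dist[letter]}")
print(f"(Total={total} letters)")
```

The total printed agrees with the 200 letters stated for this sentence.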


b. If a similar tabulation is now made of the letters comprising the words of the second
sentence in the preceding paragraph, the graph shown in Fig. 2 is obtained. Both sentences
have exactly the same number of letters (200).

(Total=200 letters)


FIGURE 2.


c. Although each of these two graphs exhibits great variation in the relative frequencies
with which different letters are employed in the respective sentences to which they apply, no
marked differences are exhibited between the frequencies of the same letter in the two graphs.
Compare, for example, the frequencies of A, B, C . . . Z in Fig. 1 with those of A, B, C . . . Z
in Fig. 2. Aside from one or two exceptions, as in the case of the letter F, these two graphs agree
rather strikingly.
d. This agreement, or similarity, would be practically complete if the two texts were much
longer, for example, five times as long. In fact, when two texts of similar character, each
containing more than 1,000 letters, are compared, it would be found that the respective frequencies
of the 26 letters composing the two graphs show only very slight differences. This means, in
other words, that in normal text each letter of the alphabet occurs with a rather constant or
characteristic frequency which it tends to approximate, depending upon the length of the text
analyzed. The longer the text (within certain limits), the closer will be the approximation.1


e. An experiment along these lines will be convincing. A series of 260 official telegrams 1
passing through the War Department Message Center was examined statistically. The messages
were divided into five sets, each totaling 10,000 letters, and the five distributions shown
in Table 1-A were obtained.

f. If the five distributions in Table 1-A are summed, the results are as shown in Table 2-A.


TABLE 1-A.-Absolute frequencies of letters appearing in five sets of governmental plain-text
telegrams, each set containing 10,000 letters, arranged alphabetically

          Absolute frequency, by message
Letter    No. 1    No. 2    No. 3    No. 4    No. 5

A           738      783      681      740      741
B           104      103       98       83       99
C           319      300      288      326      301
D           387      413      423      451      448
E         1,367    1,294    1,292    1,270    1,275
F           253      287      308      287      281
G           166      175      161      167      150
H           310      351      335      349      349
I           742      750      787      700      697
J            18       17       10       21       16
K            36       38       22       21       31
L           365      393      333      386      344
M           242      240      238      249      268
N           786      794      815      800      780
O           685      770      791      756      762
P           241      272      317      245      260
Q            40       22       45       38       30
R           760      745      762      735      786
S           658      583      585      628      604
T           936      879      894      958      928
U           270      233      312      247      238
V           163      173      142      133      155
W           166      163      136      133      182
X            43       50       44       53       41
Y           191      155      179      213      229
Z            14       17        2       11        5

Total    10,000   10,000   10,000   10,000   10,000


1 See footnote 5, page 16. 


1 These comprised messages from several departments in addition to the War Department and were all of
an administrative character.
TABLE 2-A.-Absolute frequencies of letters appearing in the combined five sets of messages totaling
50,000 letters, arranged alphabetically

A  3,683     G    819     L  1,821     Q    175     V    766
B    487     H  1,694     M  1,237     R  3,788     W    780
C  1,534     I  3,676     N  3,975     S  3,058     X    231
D  2,122     J     82     O  3,764     T  4,595     Y    967
E  6,498     K    148     P  1,335     U  1,300     Z     49
F  1,416
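The summation of the five sets can be checked mechanically. The sketch below is only an illustration; the name `TABLE_1A` is an assumption. Three entries (D in sets 2 and 4, and S in set 2) are taken here as 413, 451, and 583, these being the readings consistent with both the 10,000-letter set totals and the combined totals of Table 2-A; they should be treated as an assumption if the student's copy reads otherwise.

```python
# Absolute frequencies for message sets No. 1 to No. 5, keyed by letter.
# Assumption: D (sets 2, 4) and S (set 2) are read as 413, 451, and 583.
TABLE_1A = {
    "A": (738, 783, 681, 740, 741),
    "B": (104, 103, 98, 83, 99),
    "C": (319, 300, 288, 326, 301),
    "D": (387, 413, 423, 451, 448),
    "E": (1367, 1294, 1292, 1270, 1275),
    "F": (253, 287, 308, 287, 281),
    "G": (166, 175, 161, 167, 150),
    "H": (310, 351, 335, 349, 349),
    "I": (742, 750, 787, 700, 697),
    "J": (18, 17, 10, 21, 16),
    "K": (36, 38, 22, 21, 31),
    "L": (365, 393, 333, 386, 344),
    "M": (242, 240, 238, 249, 268),
    "N": (786, 794, 815, 800, 780),
    "O": (685, 770, 791, 756, 762),
    "P": (241, 272, 317, 245, 260),
    "Q": (40, 22, 45, 38, 30),
    "R": (760, 745, 762, 735, 786),
    "S": (658, 583, 585, 628, 604),
    "T": (936, 879, 894, 958, 928),
    "U": (270, 233, 312, 247, 238),
    "V": (163, 173, 142, 133, 155),
    "W": (166, 163, 136, 133, 182),
    "X": (43, 50, 44, 53, 41),
    "Y": (191, 155, 179, 213, 229),
    "Z": (14, 17, 2, 11, 5),
}

# Row sums give the combined distribution; column sums give each set's size.
combined = {letter: sum(counts) for letter, counts in TABLE_1A.items()}
set_totals = [sum(column) for column in zip(*TABLE_1A.values())]

print(set_totals)               # size of each of the five sets
print(sum(combined.values()))   # size of the combined text
print(combined["E"], combined["T"], combined["Z"])
```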


g. The frequencies noted in subparagraph f, when reduced to the basis of 1,000 letters and
when used as a basis for constructing a simple chart that will exhibit the variations in frequency
in a striking manner, yield the following graph, which is hereafter designated as the normal or
standard uniliteral frequency distribution for English telegraphic plain text:

FIGURE 3.
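The reduction to a basis of 1,000 letters is simple proportion, but the rounding needs a little care if the reduced figures are to total exactly 1,000. The text does not say how the rounding was done; the sketch below assumes a largest-remainder rule (each letter receives the whole part of its quota, and the letters with the largest fractional parts receive one more until the basis is reached). The function name `reduce_to_basis` is illustrative.

```python
# Combined absolute frequencies in 50,000 letters (Table 2-A).
TABLE_2A = {
    "A": 3683, "B": 487, "C": 1534, "D": 2122, "E": 6498, "F": 1416,
    "G": 819, "H": 1694, "I": 3676, "J": 82, "K": 148, "L": 1821,
    "M": 1237, "N": 3975, "O": 3764, "P": 1335, "Q": 175, "R": 3788,
    "S": 3058, "T": 4595, "U": 1300, "V": 766, "W": 780, "X": 231,
    "Y": 967, "Z": 49,
}


def reduce_to_basis(counts: dict[str, int], basis: int) -> dict[str, int]:
    """Scale counts to sum to `basis`, using the largest-remainder rule."""
    total = sum(counts.values())
    # divmod keeps the arithmetic exact: (whole part, remainder) per letter.
    quotas = {k: divmod(v * basis, total) for k, v in counts.items()}
    reduced = {k: whole for k, (whole, _) in quotas.items()}
    shortfall = basis - sum(reduced.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k][1], reverse=True)
    for k in by_remainder[:shortfall]:
        reduced[k] += 1
    return reduced


per_thousand = reduce_to_basis(TABLE_2A, 1000)
print(per_thousand)
print(sum(per_thousand.values()))
```

Plain rounding of each quota separately would give N as 80 and Q as 4, and a total of 1,002; the largest-remainder rule keeps the total at 1,000.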


10. Important features of the normal uniliteral frequency distribution.-a. When the graph
shown in Fig. 3 is studied in detail, the following features are apparent:
(1) It is quite irregular in appearance. This is because the letters are used with greatly
varying frequencies, as discussed in the preceding paragraph. This irregular appearance is often
described by saying that the graph shows marked crests and troughs, that is, points of high
frequency and low frequency.
(2) The relative positions in which the crests and troughs fall within the graph, that is, the
spatial relations of the crests and troughs, are rather definitely fixed and are determined by
circumstances which have been explained in a preceding text.3
(3) The relative heights and depths of the crests and troughs within the graph, that is, the
linear extensions of the lines marking the respective frequencies, are also rather definitely fixed,
as would be found if an equal volume of similar text were analyzed.
(4) The most prominent crests are marked by the vowels A, E, I, O, and the consonants
N, R, S, T; the most prominent troughs are marked by the consonants J, K, Q, X, and Z.
(5) The important data are summarized in tabular form in Table 3.


TABLE 3

                                                            Percent    Percent of total
                                               Frequency   of total    in round numbers
6 Vowels: A E I O U Y ........................     398       39.8            40
20 Consonants:
  5 High Frequency (D N R S T) ...............     350       35.0            35
  10 Medium Frequency (B C F G H L M P V W) ..     238       23.8            24
  5 Low Frequency (J K Q X Z) ................      14        1.4             1

Total ........................................   1,000      100.0           100


(6) The frequencies of the letters of the alphabet are as follows:

A   74     G   16     L   36     Q    3     V   15
B   10     H   34     M   25     R   76     W   16
C   31     I   74     N   79     S   61     X    5
D   42     J    2     O   75     T   92     Y   19
E  130     K    3     P   27     U   26     Z    1
F   28


(7) The relative order of frequency of the letters is as follows:

E  130     I   74     C   31     Y   19     X    5
T   92     S   61     F   28     G   16     Q    3
N   79     D   42     P   27     W   16     K    3
R   76     L   36     U   26     V   15     J    2
O   75     H   34     M   25     B   10     Z    1
A   74


(8) The four vowels A, E, I, O (combined frequency 353) and the four consonants N, R, S, T
(combined frequency 308) form 661 out of every 1,000 letters of plain text; in other words, less
than 1/3 of the alphabet is employed in writing 2/3 of normal plain text.
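The class totals of Table 3 and the figure of 661 in item (8) follow directly from the frequencies per 1,000 letters listed in item (6). A short sketch of the arithmetic, with illustrative names (`PER_THOUSAND`, `CLASSES`) that are assumptions of this example:

```python
# Frequencies per 1,000 letters of English telegraphic text, item (6).
PER_THOUSAND = {
    "A": 74, "B": 10, "C": 31, "D": 42, "E": 130, "F": 28, "G": 16,
    "H": 34, "I": 74, "J": 2, "K": 3, "L": 36, "M": 25, "N": 79,
    "O": 75, "P": 27, "Q": 3, "R": 76, "S": 61, "T": 92, "U": 26,
    "V": 15, "W": 16, "X": 5, "Y": 19, "Z": 1,
}

# The four classes used in Table 3.
CLASSES = {
    "vowels": "AEIOUY",
    "high-frequency consonants": "DNRST",
    "medium-frequency consonants": "BCFGHLMPVW",
    "low-frequency consonants": "JKQXZ",
}

class_totals = {
    name: sum(PER_THOUSAND[letter] for letter in letters)
    for name, letters in CLASSES.items()
}
top_eight = sum(PER_THOUSAND[letter] for letter in "AEIONRST")

for name, value in class_totals.items():
    print(f"{name:28s} {value:5d}  {value / 10:5.1f} percent")
print(f"A, E, I, O, N, R, S, T together: {top_eight} per 1,000")
```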


3 Section VII, Elementary Military Cryptography.
b. The data given in Fig. 3 and Table 3 represent the relative frequencies found in a large
volume of English telegraphic text of a governmental, administrative character. These fre- 
quencies will vary somewhat with the nature of the text analyzed. For example, if an equal 
number of telegrams dealing solely with commercial transactions in the leather industry were 
studied statistically, the frequencies would be slightly different because of the repeated occurrence 
of words peculiar to that industry. Again, if an equal number of telegrams dealing solely with 
military messages of a tactical character were studied statistically, the frequencies would differ 
slightly from those found above for general governmental messages of an administrative character. 
c. If ordinary English literary text (such as may be found in any book, newspaper, or printed 
document) were analyzed, the frequencies of certain letters would be changed to an appreciable 
degree. 
This is because in telegraphic text words which are not strictly essential for intelligibility 
(such as the definite and indefinite articles, certain prepositions, conjunctions and pronouns) are 
omitted. In addition, certain essential words, such as "stop", "period", "comma", and the like, 
which are usually indicated in written or printed matter by symbols not easy to transmit tele- 
graphically and which must, therefore, be spelled out in telegrams, occur very frequently. Fur- 
thermore, telegraphic text often employs longer and more uncommon words than does ordinary 
newspaper or book text. 
d. As a matter of fact, other tables compiled in the Office of the Chief Signal Officer gave 


slightly different results, depending upon the source of the text. For example, three tables based 
upon 75,000, 100,000, and 136,257 letters taken from various sources (telegrams, newspapers, 
magazine articles, books of fiction) gave as the relative order of frequency for the first 10 letters 
the following: 
For 75,000 letters ______________________ E T R N I O A S D L
For 100,000 letters ____________________ E T R I N O A S D L
For 136,257 letters ______________________ E T R N A O I S L D


TABLE 4.-Frequency table for 10,000 letters of literary English, as compiled by Hitt

ALPHABETICALLY ARRANGED

A    778     G    174     L    372     Q      8     V    112
B    141     H    595     M    288     R    651     W    176
C    296     I    667     N    686     S    622     X     27
D    402     J     51     O    807     T    855     Y    196
E  1,277     K     74     P    223     U    308     Z     17
F    197

ARRANGED ACCORDING TO FREQUENCY

E  1,277     R    651     U    308     Y    196     K     74
T    855     S    622     C    296     W    176     J     51
O    807     H    595     M    288     G    174     X     27
A    778     D    402     P    223     B    141     Z     17
N    686     L    372     F    197     V    112     Q      8
I    667
Hitt also compiled data for telegraphic text (but does not state what kind of messages) and
gives the following table: 


TABLE 5.-Frequency table for 10,000 letters of telegraphic English, as compiled by Hitt

ALPHABETICALLY ARRANGED

A    813     G    201     L    392     Q     38     V    136
B    149     H    386     M    273     R    677     W    166
C    306     I    711     N    718     S    656     X     51
D    417     J     42     O    844     T    634     Y    208
E  1,319     K     88     P    243     U    321     Z      6
F    205

ARRANGED ACCORDING TO FREQUENCY

E  1,319     S    656     U    321     F    205     K     88
O    844     T    634     C    306     G    201     X     51
A    813     D    417     M    273     W    166     J     42
N    718     L    392     P    243     B    149     Q     38
I    711     H    386     Y    208     V    136     Z      6
R    677


e. Frequency data applicable purely to English military text were compiled by Hitt,4 from 
a study of 10,000 letters taken from orders and reports. The frequencies found by him are given 
in Tables 4 and 5. 
11. Constancy of the standard or normal, uniliteral frequency distribution.-a. The 
relative frequencies disclosed by the statistical study of large volumes of text may be considered 
to be the standard or normal frequencies of the letters of written English. Counts made of 
smaller volumes of text will tend to approximate these normal frequencies, and, within certain 
limits,5 the smaller the volume, the lower will be the degree of approximation to the normal, 
until, in the case of a very short message, the normal proportions may not obtain at all. It is 
advisable that the student fix this fact firmly in mind, for the sooner he realizes the true nature 
of any data relative to the frequency of occurrence of letters in text, the less often will his labors 
toward the solution of specific ciphers be thwarted and retarded by too strict an adherence to 
these generalized principles of frequency. He should constantly bear in mind that such data 
are merely statistical generalizations, that they will be found to hold strictly true only in large 
volumes of text, and that they may not even be approximated in short messages. 
b. Nevertheless the normal frequency distribution or the "normal expectancy" for any 
alphabetic language is, in the last analysis, the best guide to, and the usual basis for, the solution 
of cryptograms of a certain type. It is useful, therefore, to reduce the normal, uniliteral 
frequency distribution to a basis that more or less closely approximates the volume of text which 
the cryptanalyst most often encounters in individual cryptograms. As regards length of mes- 
sages, counting only the letters in the body, and excluding address and signature, a study of the 


4 Op. cit., pp. 6-7.
5 It is useless to go beyond a certain limit in establishing the normal-frequency distribution for a given
language. As a striking instance of this fact, witness the frequency study made by an indefatigable German,
Kaeding, who in 1898 made a count of the letters in about 11,000,000 words, totaling about 62,000,000 letters in
German text. When reduced to a percentage basis, and when the relative order of frequency was determined,
the results he obtained differed very little from the results obtained by Kasiski, a German cryptographer, from a
count of only 1,060 letters. See Kaeding, Haeufigkeitswoerterbuch, Steglitz, 1898; Kasiski, Die Geheimschriften
und die Dechiffrir-Kunst, Berlin, 1863.
260 telegrams referred to in paragraph 9 shows that the arithmetical average is 217 letters;
the statistical mean, or weighted average,6 however, is 191 letters. These two results are,
however, close enough together to warrant the statement that the average length of telegrams 
is approximately 200 letters. The frequencies given in Par. 9f have therefore been reduced to 
a basis of 200 letters, and the following uniliteral frequency distribution may be taken as showing 
the most typical distribution to be expected in 200 letters of telegraphic English text: 


FIGURE 4. 


c. The student should take careful note of the appearance of the distribution 7 shown in 
Fig. 4, for it will be of much assistance to him in the early stages of his study. The manner of 
setting down the tallies should be followed by him in making his own distributions, indicating 
every fifth occurrence of a letter by an oblique tally. This procedure almost automatically 
shows the total number of occurrences for each letter, and yet does not destroy the graphical 
appearance of the distribution, especially if care is taken to use approximately the same amount 
of space for each set of five tallies. Cross-section paper is very useful for this purpose. 
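The tallying convention just described, in which every fifth occurrence is set down as an oblique stroke, can be imitated in type. This is only a rough sketch: the stroke characters and the function name `tally` are assumptions, and printed characters cannot reproduce the appearance of hand tallies on cross-section paper.

```python
def tally(count: int) -> str:
    """Render a count as tally strokes, every fifth stroke being oblique."""
    groups, rest = divmod(count, 5)
    parts = ["||||/"] * groups          # four upright strokes, then the oblique
    if rest:
        parts.append("|" * rest)        # incomplete final group
    return " ".join(parts)


# Hypothetical counts, for illustration only.
for letter, n in {"E": 13, "T": 7, "Q": 0}.items():
    print(f"{letter}  {tally(n)}")
```

Because each completed group occupies the same width, the total for a letter can be read off as five per group plus the odd strokes.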
d. The word "uniliteral" in the designation "uniliteral frequency distribution" means
"single letter", and it is to be inferred that other types of frequency distributions may be
encountered. For example, a distribution of pairs of letters, constituting a biliteral frequency
distribution, is very often used in the study of certain cryptograms in which it is desired that pairs
made by combining successive letters be listed. A biliteral distribution of A B C D E F would
take these pairs: AB, BC, CD, DE, EF. The distribution could be made in the form of a large
square divided up into 676 cells. When distributions beyond biliteral are required (triliteral,
quadraliteral, etc.) they can only be made by listing them in some order, for example,
alphabetically based on the 1st, 2d, 3d, ... letter.
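The pairing of successive letters can be sketched as follows. The function names are illustrative assumptions; a counter keyed by pair stands in for the square of 676 cells, only the occupied cells being stored.

```python
from collections import Counter

CELLS = 26 * 26  # one cell for every possible pair of letters


def successive_pairs(text: str) -> list[str]:
    """List the overlapping pairs formed by each letter and its successor."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return ["".join(pair) for pair in zip(letters, letters[1:])]


def biliteral_distribution(text: str) -> Counter:
    """Count how often each pair of successive letters occurs."""
    return Counter(successive_pairs(text))


print(successive_pairs("A B C D E F"))   # ['AB', 'BC', 'CD', 'DE', 'EF']
print(CELLS)                             # 676
```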


6 The arithmetical average is obtained by adding each different length and dividing by the number of
different-length messages; the mean is obtained by multiplying each different length by the number of messages 
of that length, adding all products, and dividing by the total number of messages. 
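The two computations defined in this footnote give different results whenever some lengths occur more often than others. A small sketch, following the definitions literally; the message lengths are hypothetical and are not taken from the 260 telegrams.

```python
from collections import Counter


def arithmetical_average(lengths: list[int]) -> float:
    """Add each different length once; divide by the number of different lengths."""
    distinct = set(lengths)
    return sum(distinct) / len(distinct)


def weighted_mean(lengths: list[int]) -> float:
    """Weight each different length by the number of messages of that length."""
    by_length = Counter(lengths)
    products = sum(length * n for length, n in by_length.items())
    return products / len(lengths)


# Hypothetical lengths: three short messages, one medium, one long.
lengths = [100, 100, 100, 200, 300]
print(arithmetical_average(lengths))  # 200.0
print(weighted_mean(lengths))         # 160.0
```

The weighted mean is pulled toward the commonest lengths, which is consistent with its falling below the arithmetical average when short messages predominate.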


7 The use of the terms "distribution" and "frequency distribution", instead of "table" and "frequency 
table", respectively, is considered advisable from the point of view of consistency with the usual statistical 
nomenclature. 
When data are given in tabular form, with frequencies indicated by numbers, then they may 
properly be said to be set out in the form of a table. 
When, however, the same data are distributed in a chart 
which partakes of the nature of a graph, with the data indicated by horizontal or vertical linear extensions, or 
by a curve connecting points corresponding to quantities, then it is more proper to call such a graphic represen- 
tation of the data a distribution. 
SECTION IV


FUNDAMENTAL USES OF THE UNILITERAL FREQUENCY DISTRIBUTION 


Paragraph 


The four facts which can be determined from a study of the uniliteral frequency distribution for a cryptogram
Determining whether the standard cipher alphabet is direct or reversed
12. The four facts which can be determined from a study of the uniliteral frequency
distribution for a cryptogram.-a. The following four facts (to be explained subsequently) can
usually be determined from an inspection of the uniliteral frequency distribution for a given 
cipher message of average length, composed of letters: 
(1) Whether the cipher belongs to the substitution or the transposition class; 
(2) If to the former, whether it is monoalphabetic or polyalphabetic in character;
(3) If monoalphabetic, whether the cipher alphabet is a standard cipher alphabet or a mixed
cipher alphabet;


(4) If standard, whether it is a direct or reversed standard cipher alphabet. 
b. For immediate purposes the first two of the foregoing determinations are quite important 
and will be discussed in detail in the next two subparagraphs; the other two determinations will 
be touched upon very briefly, leaving their detailed discussion for subsequent sections of the 
text. 
13. Determining the class to which a cipher belongs.-a. The determination of the class
to which a cipher belongs is usually a relatively easy matter because of the fundamental difference
in the nature of transposition and of substitution as cryptographic processes. In a transposition
cipher the original letters of the plain text have merely been rearranged, without any change
whatsoever in their identities, that is, in the conventional values they have in the normal
alphabet. Hence, the numbers of vowels (A, E, I, O, U, Y), high-frequency consonants (D, N, R, S, T),
medium-frequency consonants (B, C, F, G, H, L, M, P, V, W), and low-frequency consonants (J, K, 
Q, X, Z) are exactly the same in the cryptogram as they are in the plain-text message. Therefore, 
the percentages of vowels, high, medium, and low-frequency consonants are the same in the 
transposed text as in the equivalent plain text. In a substitution cipher, on the other hand, the 
identities of the original letters of the plain text have been changed, that is, the conventional 
values they have in the normal alphabet have been altered. Consequently, if a count is made 
of the various letters present in such a cryptogram, it will be found that the number of vowels, 
high, medium, and low-frequency consonants will usually be quite different in the cryptogram 
from what they are in the original plain-text message. Therefore, the percentages of vowels, 
high, medium, and low-frequency consonants are usually quite different in the substitution text 
from what they are in the equivalent plain text. From these considerations it follows that if in 
a specific cryptogram the percentages of vowels, high, medium, and low-frequency consonants 
are approximately the same as would be expected in normal plain text, the cryptogram probably 
belongs to the transposition class; if these percentages are quite different from those to be 
expected in normal plain text, the cryptogram probably belongs to the substitution class.
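The reasoning of subparagraph a can be checked mechanically. The following is a minimal sketch in Python, not part of the original text: the function name `class_counts`, the sample message, the reversal used as a stand-in transposition, and the six-place displacement used as a stand-in substitution are all assumptions chosen for illustration. The four letter classes are those listed above.

```python
from collections import Counter

# Letter classes exactly as listed in paragraph 13a.
CLASSES = {
    "vowels": "AEIOUY",
    "high": "DNRST",
    "medium": "BCFGHLMPVW",
    "low": "JKQXZ",
}

def class_counts(text):
    """Count the letters of `text` falling in each of the four classes."""
    letters = Counter(ch for ch in text.upper() if ch.isalpha())
    return {name: sum(letters[ch] for ch in members)
            for name, members in CLASSES.items()}

plain = "HOSTILE FORCE ESTIMATED AT ONE REGIMENT INFANTRY"

# Hypothetical transposition: the same letters written in reverse order.
transposed = plain.replace(" ", "")[::-1]

# Hypothetical substitution: every letter moved six places along the alphabet.
substituted = "".join(chr((ord(ch) - 65 + 6) % 26 + 65)
                      for ch in plain if ch.isalpha())

print(class_counts(plain))        # counts for the plain text
print(class_counts(transposed))   # identical: letters only rearranged
print(class_counts(substituted))  # different: identities changed
```

Any rearrangement whatever would serve in place of the reversal, since the counts depend only on which letters are present and not on their order.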
b. In the preceding subparagraph the word "probably" was emphasized by italicizing it,
for there can be no certainty in every case of this determination. Usually these percentages in
a transposition cipher are close to the normal percentages for plain text; usually, in a substitu-
tion cipher, they are far different from the normal percentages for plain text. But occasionally
a cipher message is encountered which is difficult to classify with a reasonable degree of certainty 
because the message is too short for the general principles of frequency to manifest themselves. 
It is clear that if in actual messages there were no variation whatever from the normal vowel 
and consonant percentages given in Table 3, the determination of the class to which a specific 
cryptogram belongs would be an extremely simple matter. But unfortunately there is always 
some variation or deviation from the normal. Intuition suggests that as messages decrease in 
length there may be a greater and greater departure from the normal proportions of vowels, 
high, medium, and low-frequency consonants, until in very short messages the normal propor- 
tions may not hold at all. Similarly, as messages increase in length there may be a lesser and 
lesser departure from the normal proportions, until in messages totalling a thousand or more 
letters there may be no difference at all between the actual and the theoretical proportions. 
But intuition is not enough, for in dealing with specific messages of the length of those commonly 
encountered in practical work the question sometimes arises as to exactly how much deviation 
(from the normal proportions) may be allowed for in a cryptogram which shows a considerable 
amount of deviation from the normal and which might still belong to the transposition rather
than to the substitution class. 
c. Statistical studies have been made on this matter and some graphs have been constructed
thereon. These are shown in Charts 1-4 in the form of simple curves, the use of which will now
be explained. Each chart contains two curves marking the lower and upper limits, respectively, 
of the theoretical amount of deviation (from the normal percentages) of vowels or consonants 
which may be allowable in a cipher believed to belong to the transposition class. 
d. In Chart 1, curve V1 marks the lower limit of the theoretical amount of deviation from the
normal number of vowels to be expected in a message of given length; curve V2 marks the upper
limit of the same thing. Thus, for example, in a message of 100 letters in plain English there
should be between 33 and 47 vowels (A E I O U Y). Likewise, in Chart 2 curves H1 and H2
mark the lower and upper limits as regards the high-frequency consonants. In a message of 100
letters there should be between 28 and 42 high-frequency consonants (D N R S T). In Chart 3,
curves M1 and M2 mark the lower and upper limits as regards the medium-frequency consonants.
In a message of 100 letters there should be between 17 and 31 medium-frequency consonants
(B C F G H L M P V W). Finally, in Chart 4, curves L1 and L2 mark the lower and upper
limits as regards the low-frequency consonants. In a message of 100 letters there should be
between 0 and 3 low-frequency consonants (J K Q X Z). In using the charts, therefore, one
finds the point of intersection of the vertical coordinate corresponding to the length of the
message, with the horizontal coordinate corresponding to (1) the number of vowels, (2) the
number of high-frequency consonants, (3) the number of medium-frequency consonants, and
(4) the number of low-frequency consonants actually counted in the message. If all four points
of intersection fall within the area delimited by the respective curves, then the number of vowels,
high, medium, and low-frequency consonants corresponds with the number theoretically expected
in a normal plain-text message of the same length; since the message under investigation is not
plain text, it follows that the cryptogram may certainly be classified as a transposition cipher.
On the other hand, if one or more of these points of intersection falls outside the area delimited
by the respective curves, it follows that the cryptogram is probably a substitution cipher. The
distance that the point of intersection falls outside the area delimited by these curves is a more or
less rough measure of the improbability of the cryptogram's being a transposition cipher.
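For a message of exactly 100 letters the test of subparagraph d reduces to four comparisons. The sketch below, in Python, is an illustration only: it uses the four pairs of limits quoted above for 100 letters, it does not reproduce the curves of Charts 1-4 for other lengths, and the function name `classes_outside_limits` and the two sets of counts are hypothetical.

```python
# Limits quoted in subparagraph d for a message of exactly 100 letters.
LIMITS_100 = {
    "vowels": (33, 47),
    "high": (28, 42),
    "medium": (17, 31),
    "low": (0, 3),
}

def classes_outside_limits(counts):
    """Return the names of the classes whose counts fall outside the 100-letter limits."""
    return [name for name, (lower, upper) in LIMITS_100.items()
            if not lower <= counts[name] <= upper]

# Hypothetical counts for two 100-letter cryptograms.
first = {"vowels": 40, "high": 35, "medium": 23, "low": 2}
second = {"vowels": 22, "high": 30, "medium": 30, "low": 18}

print(classes_outside_limits(first))   # [] : all four within limits
print(classes_outside_limits(second))  # ['vowels', 'low'] : probably substitution
```

An empty result corresponds to all four points of intersection falling within the curves; a non-empty result corresponds to one or more points falling outside.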
e. Sometimes a cryptogram is encountered which is hard to classify with certainty even with
the foregoing aids, because it has been consciously prepared with a view to making the classifica- 
tion difficult. This can be done either by selecting peculiar words (as in "trick cryptograms") 
or by employing a cipher alphabet in which letters of approximately similar normal frequencies 
have been interchanged. For example, E may be replaced by O, T by R, and so on, thus yielding
a cryptogram giving external indications of being a transposition cipher but which is really a
substitution cipher. If the cryptogram is not too short, a close study will usually disclose what 
has been done, as well as the futility of so simple a subterfuge. 
f. In the majority of cases, in practical work, the determination of the class to which a
cipher of average length belongs can be made from a mere inspection of the message, after the
cryptanalyst has acquired a familiarity with the normal appearance of transposition and of
substitution ciphers. In the former case, his eyes very speedily note many high-frequency letters,
such as E, T, N, R, O, and S, with the absence of low-frequency letters, such as J, K, Q, X,
and Z; in the latter case, his eyes just as quickly note the presence of many low-frequency letters
and a corresponding absence of the usual high-frequency letters.
g. Another rather quickly completed test, in the case of the simpler varieties of ciphers, is 
to look for repetitions of groups of letters. As will become apparent very soon, recurrences of 
syllables, entire words and short phrases constitute a characteristic of all normal plain text. 
Since a transposition cipher involves a change in the sequence of the letters composing a plain-
text message, such recurrences are broken up so that the cipher text no longer will show repetitions
of more or less lengthy sequences of letters. But if a cipher message does show many repetitions
and these are of several letters in length, say over four or five, the conclusion is at once warranted
that the cryptogram is most probably a substitution and not a transposition cipher. However,
for the beginner in cryptanalysis, it will be advisable to make the uniliteral frequency distribution,
and note the frequencies of the vowels, the high, medium, and low-frequency consonants. Then,
referring to Charts 1 to 4, he should carefully note whether or not the observed frequencies for
these categories of letters fall within the limits of the theoretical frequencies for a normal plain-
text message of the same length, and be guided accordingly.
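The search for repetitions described in subparagraph g lends itself to a simple mechanical procedure. The following Python sketch is offered as one possible way of doing it, not as a method prescribed by this text; the function name `repeated_sequences` and the choice of five letters as the length sought are assumptions. The cryptogram used is the one worked out in paragraph 18.

```python
from collections import defaultdict

def repeated_sequences(text, length=5):
    """Map each sequence of `length` letters occurring more than once to its positions."""
    letters = "".join(ch for ch in text.upper() if ch.isalpha())
    positions = defaultdict(list)
    for i in range(len(letters) - length + 1):
        positions[letters[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}

# Cryptogram of paragraph 18 (a monoalphabetic substitution cipher).
CRYPTOGRAM = (
    "NUYZO RKLUX IKKYZ OSGZK JGZUT KXKMO SKTZO TLGTZ XEGTJ "
    "ZCUVR GZUUT YIGBG RXESU BOTMY UAZNU TWAOT TOSUT ZVOQK "
    "YZUVN KGJUL IURAS TTKGX OTMXU GJPAT IZOUT YKBKT ZNXKK "
    "YKBKT IUSSG KGYZU LMXKK TGIXK YINUU RLOXK JAVUT HEUAX "
    "VGZXU RYYZU VNGBK JKYZX UEKJH XOJMK UBKXO TJOGT IXKKQ"
)

for sequence, where in repeated_sequences(CRYPTOGRAM).items():
    print(sequence, where)
```

Positions are counted from zero. The repeated groups found here arise from the repeated plain-text words SEVEN and STOP, which a substitution leaves intact as sequences and which a transposition would have broken up.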
h. It is obvious that the foregoing rule applies only to ciphers composed wholly of letters. 
If a message is composed entirely of figures, or of arbitrary signs and symbols, or of intermixtures 
of letters, figures and other symbols, it is immediately apparent that the cryptogram is a sub-
stitution cipher.
i. Finally, it should be mentioned that there are certain kinds of cryptograms whose class
cannot be determined by the method set forth in subparagraphs b, c, d above. These exceptions
will be discussed in a subsequent section of this text. 1


1 Par. 47.


14. Determining whether a substitution cipher is monoalphabetic or polyalphabetic.-a. It
will be remembered that a monoalphabetic substitution cipher is one in which a single cipher
alphabet is employed throughout the whole message, that is, a given plain-text letter is invariably
represented throughout the message by one and the same letter in the cipher text. On the other
hand, a polyalphabetic substitution cipher is one in which two or more cipher alphabets are 
employed within the same message; that is, a given plain-text letter may be represented by two or 
more different letters in the cipher text, according to some rule governing the selection of the 
equivalent to be used in each case. From this it follows that a single cipher letter may represent 
two or more different plain-text letters. 
b. It is easy to see why and how the appearance of the uniliteral frequency distribution for
a substitution cipher may be used to determine whether the cryptogram is monoalphabetic or
polyalphabetic in character. The normal distribution presents marked crests and troughs by
virtue of two circumstances. First, the elementary sounds which the symbols represent are
used with greatly varying frequencies, it being one of the striking characteristics of every alpha-
betic language that its elementary sounds are used with greatly varying frequencies. 2 In the
second place, except for orthographic aberrations peculiar to certain languages (conspicuously,
English and French), each such sound is represented by the same symbol. It follows, therefore,
that since in a monoalphabetic substitution cipher each different plain-text letter (=elementary
sound) is represented by one and only one cipher letter (=elementary symbol), the uniliteral
frequency distribution for such a cipher message must also exhibit the irregular crest and trough
appearance of the normal distribution, but with only this important modification: the absolute
positions of the crests and troughs will not be the same as in the normal. That is, the letters accom-
panying the crests and the troughs in the distribution for the cryptogram will be different from
those accompanying the crests and the troughs in the normal distribution. But the marked
irregularity of the distribution, the presence of accentuated crests and troughs, is in itself an
indication that each symbol or cipher letter always represents the same plain-text letter in that
cryptogram. Hence the general rule: A marked crest and trough appearance in the uniliteral
frequency distribution for a given cryptogram indicates that a single cipher alphabet is involved and
constitutes one of the tests for a monoalphabetic substitution cipher.


2 The student who is interested in this phase of the subject may find the following reference of value: Zipf,
G. K., Selected Studies of the Principle of Relative Frequency in Language, Cambridge, Mass., 1932.

c. On the other hand, suppose that in a cryptogram each cipher letter represents several 
different plain-text letters. Some of them are of high frequency, others of low frequency. The 
net result of such a situation, so far as the uniliteral frequency distribution for the cryptogram 
is concerned, is to prevent the appearance of any marked crests and troughs and to tend to reduce 
the elements of the distribution to a more or less common level. This imparts a "flattened 
out" appearance to the distribution. For example, in a certain cryptogram of polyalphabetic 
construction, Kc=Ep, Gp, and Jp; Rc=Ap, Op, and Bp; Xc=Op, Lp, and Fp. The frequencies of
Kc, Rc, and Xc will be approximately equal because the summations of the frequencies of the several
plain-text letters each of these cipher letters represents at different times will be about equal.
If this same phenomenon were true of all the letters of the cryptogram, it is clear that the
frequencies of the 26 letters, when shown by means of the ordinary uniliteral frequency distribu-
tion, would show no striking differences and the distribution would have the flat appearance of
a typical polyalphabetic substitution cipher. Hence, the general rule: The absence of marked
crests and troughs in the uniliteral frequency distribution indicates that two or more cipher alphabets
are involved. The flattened-out appearance of the distribution constitutes one of the tests for a poly-
alphabetic substitution cipher.
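The contrast between the two rules may be seen in an extreme, artificial case. The Python sketch below is an illustration under stated assumptions and is not drawn from this text: the plain text consists only of the letters E and A, the monoalphabetic cipher is a direct standard alphabet displaced six places, and the polyalphabetic cipher (the function `progressive`, a hypothetical scheme) uses all 26 direct standard alphabets in rotation, one per letter.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def monoalphabetic(plain, shift):
    """Encipher with one direct standard alphabet (a fixed displacement)."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26] for ch in plain)

def progressive(plain):
    """Hypothetical polyalphabetic scheme: the displacement increases by one
    at each letter, so all 26 direct standard alphabets are used in rotation."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + i) % 26]
                   for i, ch in enumerate(plain))

def distribution(text):
    """Frequencies of A..Z as a list of 26 counts."""
    counts = Counter(text)
    return [counts[ch] for ch in ALPHABET]

# Artificial plain text, chosen so that the result can be verified by hand.
plain = "E" * 52 + "A" * 26

mono = distribution(monoalphabetic(plain, 6))
poly = distribution(progressive(plain))

print(mono)  # two crests (52 and 26), all other letters blank
print(poly)  # every letter has the same frequency: completely flat
```

In the monoalphabetic case the crests of the plain text are preserved and merely moved; in the polyalphabetic case each cipher letter represents E at some positions and A at others, and the totals level out. Real messages will show the same tendency less perfectly.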
d. The foregoing test based upon the appearance of the frequency distribution constitutes 
only one of several means of determining whether a substitution cipher is monoalphabetic or 
polyalphabetic in composition. It can be employed in cases yielding frequency distributions 
from which definite conclusions can be drawn with more or less certainty by mere ocular exami- 
nation. In those cases in which the frequency distributions contain insufficient data to permit 
drawing definite conclusions by such examination, certain statistical tests can be applied. These 
will be discussed in a subsequent text. 


e. At this point, however, one additional test will be given because of its simplicity of appli- 
cation. It may be employed in testing messages up to 200 letters in length, it being assumed that 
in messages of greater length ocular examination of the frequency distribution offers little or no 
difficulty. This test concerns the number of blanks in the frequency distribution, that is, the 
number of letters of the alphabet which are entirely absent from the message. It has been 
found from statistical studies that rather definite "laws" govern the theoretically expected num- 
ber of blanks in normal plain-text messages and in frequency distributions for cryptograms of 
different natures and of various sizes. The results of certain of these studies have been embodied 
in Chart 5. 
f. This chart contains two curves. The one labeled P applies to the average number of
blanks theoretically expected in frequency distributions based upon normal plain-text messages 
of the indicated lengths. The other curve, labeled R, applies to the average number of blanks 
theoretically expected in frequency distributions based upon perfectly random assortments of 
letters; that is, assortments such as would be found by random selection of letters out of a hat 
containing thousands of letters, all of the 26 letters of the alphabet being present in equal pro- 
portions, each letter being replaced after a record of its selection has been made. Such random 
assortments correspond to polyalphabetic cipher messages in which the number of cipher alpha-
bets is so large that if uniliteral frequency distributions are made of the letters the distributions
are practically identical with those which are obtained by random selections of letters out of a hat.
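For the random case described in subparagraph f the average number of blanks can be calculated directly. Any one letter is missed by a single draw with probability 25/26, and therefore by all N draws with probability (25/26)^N; summing over the 26 letters gives an average of 26 x (25/26)^N blanks. The Python sketch below applies this formula. It is offered as an assumption about how a curve such as R could be computed, since this text does not state how Chart 5 was prepared; the function names are hypothetical, and no corresponding formula is attempted for curve P.

```python
import string

def count_blanks(text):
    """Number of letters of the alphabet entirely absent from `text`."""
    present = {ch for ch in text.upper() if ch in string.ascii_uppercase}
    return 26 - len(present)

def expected_blanks_random(n):
    """Average number of blanks among n letters drawn at random, with
    replacement, all 26 letters being equally likely at each draw."""
    return 26 * (25 / 26) ** n

for n in (25, 50, 100, 200):
    print(n, round(expected_blanks_random(n), 2))
```

The observed number of blanks returned by `count_blanks` for a given message would then be compared with the theoretical value for a message of the same length.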
g. In using this chart, one finds the point of intersection of the vertical coordinate corre-
sponding to the length of the message, with the horizontal coordinate corresponding to the
observed number of blanks in the distribution for the message. If this point of intersection falls
closer to curve P than it does to curve R, the number of blanks in the message approximates or
corresponds more closely to the number theoretically expected in a plain-text message than it
does to a random (cipher-text) message of the same length; therefore, this is evidence that the
cryptogram is monoalphabetic. Conversely, if this point of intersection falls closer to curve R
than to curve P, the number of blanks in the message approximates or corresponds more closely
to the number theoretically expected in a random text than it does to a plain-text message of the
same length; therefore, this is evidence that the cryptogram is polyalphabetic.


CHART 5. (Curves P and R for messages of various lengths. See Par. 14f.)

h. Practical examples of the use of this chart will be given in some of the illustrative messages 
to follow. 
15. Determining whether the cipher alphabet is a standard or a mixed cipher alphabet.-
a. Assuming that the uniliteral frequency distribution for a given cryptogram has been made, and
that it shows clearly that the cryptogram is a substitution cipher and is monoalphabetic in
character, a consideration of the nature of standard cipher alphabets 3 almost makes it obvious
how an inspection of the distribution will disclose whether the cipher alphabet involved is a
standard cipher alphabet or a mixed cipher alphabet. If the crests and troughs of the distribu-
tion occupy positions which correspond to the relative positions they occupy in the normal
frequency distribution, then the cipher alphabet is a standard cipher alphabet. If this is not the
case, then it is highly probable that the cryptogram has been prepared by the use of a mixed
cipher alphabet.


3 See Sec. VIII, Elementary Military Cryptography.

b. A mechanical test may be applied in doubtful cases arising from lack of material available
for study. Just what this test involves, and an illustration of its application, will be given in the
next section, using specific examples.
16. Determining whether the standard cipher alphabet is direct or reversed.-Assuming
that the frequency distribution for a given cryptogram shows clearly that a standard cipher
alphabet is involved, the determination as to whether the alphabet is direct or reversed can also
be made by inspection, since the difference between the two is merely a matter of the direction
in which the sequence of crests and troughs progresses: to the right, as in normal reading or
writing, or to the left. In a direct cipher alphabet the direction in which the crests and troughs
of the distribution should be read is the normal direction, from left to right; in a reversed cipher
alphabet this direction is reversed, from right to left.
17. Principles of solution by construction and analysis of the uniliteral frequency distri-
bution.-a. Standard cipher alphabets are of two sorts, direct and reversed. The analysis of
monoalphabetic cryptograms prepared by their use follows almost directly from a consideration of 
the nature of such alphabets. Since the cipher component of a standard cipher alphabet consists 
either of the normal sequence merely displaced 1, 2, 3, ... intervals from the normal point of 
coincidence, or of the normal sequence proceeding in a reversed-normal direction, it is obvious 
that the uniliteral frequency distribution for a cryptogram prepared by means of such a cipher 
alphabet employed monoalphabetically will show crests and troughs whose relative positions
and frequencies will be exactly the same as in the uniliteral frequency distribution for the plain
text of that cryptogram. The only thing that has happened is that the whole set of crests and 
troughs of the distribution has been displaced to the right or left of the position it occupies in the 
distribution for the plain text; or else the successive elements of the whole set progress in the 
opposite direction. Hence, it follows that the correct determination of the plain-text value of the 
letter marking any crest or trough of the uniliteral frequency distribution will result at one 
stroke in the correct determination of the plain-text values of all the remaining 25 letters respec-
tively marking the other crests and troughs in that distribution. Thus, having determined the
value of a single element of the cipher component of the cipher alphabet, the values of all the 
remaining letters of the cipher component are automatically solved at one stroke. In more 
simple language, the correct determination of the value of a single letter of the cipher text 
automatically gives the values of the other 25 letters of the cipher text. The problem thus 
resolves itself into a matter of selecting that point of attack which will most quickly or most 
easily lead to the determination of the value of one cipher letter. The single word identification 
will hereafter be used for the phrase "determination of the value of a cipher letter"; to identify a 
cipher letter is to find its plain-text value. 
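The statement that one identification gives the values of the other 25 letters can be put in the form of a short procedure. The following Python sketch is illustrative only; the function name `deciphering_alphabet` and its parameters are hypothetical. It assumes a standard cipher alphabet, that is, a cipher component consisting of the normal sequence either displaced (direct) or running in the opposite direction (reversed).

```python
import string

ALPHABET = string.ascii_uppercase

def deciphering_alphabet(cipher_letter, plain_letter, reversed_alphabet=False):
    """Derive all 26 plain-text values from a single identification.

    Returns a dict mapping each cipher letter to its plain-text value,
    on the assumption that a standard cipher alphabet is involved.
    """
    c = ALPHABET.index(cipher_letter)
    p = ALPHABET.index(plain_letter)
    mapping = {}
    for i, plain in enumerate(ALPHABET):
        if reversed_alphabet:
            cipher = ALPHABET[(c + p - i) % 26]  # cipher sequence runs backward
        else:
            cipher = ALPHABET[(c - p + i) % 26]  # cipher sequence runs forward
        mapping[cipher] = plain
    return mapping

# Single identification: cipher K represents plain-text E, direct alphabet assumed.
values = deciphering_alphabet("K", "E")
print("".join(values[ch] for ch in "NUYZO"))
```

Whether the identification and the assumed direction are correct can be judged only by the fit of the remaining frequencies and, finally, by whether intelligible plain text results.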
b. It is obvious that the easiest point of attack is to assume that the letter marking the crest
of greatest frequency in the frequency distribution for the cryptogram represents Ep. Proceeding
from this initial point, the identifications of the remaining cipher letters marking the other crests
and troughs are tentatively made on the basis that the letters of the cipher component proceed
in accordance with the normal alphabetic sequence, either direct or reversed. If the actual
frequency of each letter marking a crest or a trough approximates to a fairly close degree the
normal theoretical frequency of the assumed plain-text equivalent, then the initial identification
θc=Ep may be assumed to be correct and therefore the derived identifications of the other cipher
letters may be assumed to be correct. If the original starting point for assignment of plain-text
values is not correct, or if the direction of "reading" the successive crests and troughs of the
distribution is not correct, then the frequencies of the other 25 cipher letters will not correspond
to or even approximate the normal theoretical frequencies of their hypothetical plain-text equiva-
lents on the basis of the initial identification. A new initial point, that is, a different cipher
equivalent must then be selected to represent Ep; or else the direction of "reading" the crests and
troughs must be reversed. This procedure, that is, the attempt to make the actual frequency
relations exhibited by the uniliteral frequency distribution for a given cryptogram conform to the
theoretical frequency relations of the normal frequency distribution in an effort to solve the
cryptogram, is referred to technically as "fitting the actual uniliteral frequency distribution for a
cryptogram to the theoretical uniliteral frequency distribution for normal plain text", or, more
briefly, as "fitting the frequency distribution for the cryptogram to the normal frequency distribution",
or, still more briefly, "fitting the distribution to the normal." In statistical work the expression
commonly employed in connection with this process of fitting an actual distribution to a the-
oretical one is "testing the goodness of fit." The goodness of fit may be stated in various ways,
mathematical in character.
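The process of fitting the distribution to the normal may be imitated by trying every direct and every reversed standard alphabet and scoring each. The Python sketch below is one crude way of doing so and is not the procedure of this text: the score (the sum, over all letters, of the actual frequency multiplied by the normal frequency of the assumed plain-text equivalent) is merely one of the various possible measures of goodness of fit, the table `NORMAL` contains commonly quoted approximate percentages for English and not the values of Table 3, and the function name `fit_to_normal` is hypothetical. The cryptogram is that of paragraph 18.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

# ASSUMPTION: approximate percentages for English, commonly quoted round figures;
# these are NOT the values of Table 3 of this text.
NORMAL = dict(zip(ALPHABET, [
    8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.8, 4.0, 2.4,
    6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.15, 2.0, 0.07]))

def fit_to_normal(cryptogram):
    """Score all 52 standard alphabets; return candidates, best fit first.

    Each candidate is (score, direction, cipher equivalent of plain-text A).
    """
    counts = Counter(ch for ch in cryptogram.upper() if ch in ALPHABET)
    candidates = []
    for direction in ("direct", "reversed"):
        for key in range(26):
            score = 0.0
            for p, plain in enumerate(ALPHABET):
                c = (key + p) % 26 if direction == "direct" else (key - p) % 26
                score += counts[ALPHABET[c]] * NORMAL[plain]
            candidates.append((score, direction, ALPHABET[key]))
    return sorted(candidates, reverse=True)

CRYPTOGRAM = (
    "NUYZO RKLUX IKKYZ OSGZK JGZUT KXKMO SKTZO TLGTZ XEGTJ "
    "ZCUVR GZUUT YIGBG RXESU BOTMY UAZNU TWAOT TOSUT ZVOQK "
    "YZUVN KGJUL IURAS TTKGX OTMXU GJPAT IZOUT YKBKT ZNXKK "
    "YKBKT IUSSG KGYZU LMXKK TGIXK YINUU RLOXK JAVUT HEUAX "
    "VGZXU RYYZU VNGBK JKYZX UEKJH XOJMK UBKXO TJOGT IXKKQ"
)

score, direction, a_equivalent = fit_to_normal(CRYPTOGRAM)[0]
print(direction, a_equivalent)
```

A high score shows only that the fit is good; as with fitting by eye, the final test remains whether substitution of the derived values yields intelligible plain text.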
c. In fitting the actual distribution to the normal, it is necessary to regard the cipher com-
ponent (that is, the letters A ... Z marking the successive crests and troughs of the distribution)
as partaking of the nature of a wheel or sequence closing in upon itself, so that no matter with
what crest or trough one starts, the spatial and frequency relations of the crests and troughs are
constant. This manner of regarding the cipher component as being cyclic in nature is valid
because it is obvious that the relative positions and frequencies of the crests and troughs of any uniliteral
frequency distribution must remain the same regardless of what letter is employed as the initial point
of the distribution. Fig. 5 gives a clear picture of what is meant in this connection, as applied to
the normal frequency distribution.


FIGURE 5.


d. In the third sentence of subparagraph b, the phrase "assumed to be correct" was ad-
visedly employed in describing the results of the attempt to fit the distribution to the normal,
because the final test of the goodness of fit in this connection (that is, of the correctness of the 
assignment of values to the crests and troughs of the distribution) is whether the consistent 
substitution of the plain-text values of the cipher characters in the cryptogram will yield intelli- 
gible plain text. If this is not the case, then no matter how close the approximation between 
actual and theoretical frequencies is, no matter how well the actual frequency distribution fits 
the normal, the only possible inferences are that (1) either the closeness of the fit is a pure coin- 
cidence in this case, and that another equally good fit may be obtained from the same data, or 
else (2) the cryptogram involves something more than simple monoalphabetic substitution by 
means of a single standard cipher alphabet. For example, suppose a transposition has been
applied in addition to the substitution. Then, although an excellent correspondence between
the uniliteral frequency distribution and the normal frequency distribution has been obtained,
the substitution of the cipher letters by their assumed equivalents will still not yield plain text.
However, aside from such cases of double encipherment, instances in which the uniliteral fre-
quency distribution may be easily fitted to the normal frequency distribution and in which at
the same time an attempted simple substitution fails to yield intelligible text are rare. It may be
said that, in practical operations, whenever the uniliteral frequency distribution can be made to
fit the normal frequency distribution, substitution of values will result in solution; and, as a
corollary, whenever the uniliteral frequency distribution cannot be made to fit the normal
frequency distribution, the cryptogram does not represent a case of simple monoalphabetic
substitution by means of a standard alphabet.
18. Theoretical example of solution.-a. The foregoing principles will become clearer by
noting the cryptographing and solution of a theoretical example. The following message is to be
cryptographed.


HOSTILE FORCE ESTIMATED AT ONE REGIMENT INFANTRY AND TWO PLATOONS CAVALRY
MOVING SOUTH ON QUINNIMONT PIKE STOP HEAD OF COLUMN NEARING ROAD JUNCTION SEVEN
THREE SEVEN COMMA EAST OF GREENACRE SCHOOL FIRED UPON BY OUR PATROLS STOP
HAVE DESTROYED BRIDGE OVER INDIAN CREEK
b. First, solely for purposes of demonstrating certain principles, the uniliteral frequency dis-
tribution for this message is presented in Figure 6.


FIGURE 6.

c. Now let the foregoing message be cryptographed monoalphabetically by the following
cipher alphabet, yielding the cryptogram and the frequency distribution shown below.


Plain ____________ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher ___________ G H I J K L M N O P Q R S T U V W X Y Z A B C D E F


Plain ____________ HOSTI LEFOR CEEST IMATE DATON EREGI MENTI NFANT RYAND
Cipher ___________ NUYZO RKLUX IKKYZ OSGZK JGZUT KXKMO SKTZO TLGTZ XEGTJ

Plain ____________ TWOPL ATOON SCAVA LRYMO VINGS OUTHO NQUIN NIMON TPIKE
Cipher ___________ ZCUVR GZUUT YIGBG RXESU BOTMY UAZNU TWAOT TOSUT ZVOQK

Plain ____________ STOPH EADOF COLUM NNEAR INGRO ADJUN CTION SEVEN THREE
Cipher ___________ YZUVN KGJUL IURAS TTKGX OTMXU GJPAT IZOUT YKBKT ZNXKK

Plain ____________ SEVEN COMMA EASTO FGREE NACRE SCHOO LFIRE DUPON BYOUR
Cipher ___________ YKBKT IUSSG KGYZU LMXKK TGIXK YINUU RLOXK JAVUT HEUAX

Plain ____________ PATRO LSSTO PHAVE DESTR OYEDB RIDGE OVERI NDIAN CREEK
Cipher ___________ VGZXU RYYZU VNGBK JKYZX UEKJH XOJMK UBKXO TJOGT IXKKQ
CRYPTOGRAM


NUYZO RKLUX IKKYZ OSGZK JGZUT KXKMO SKTZO TLGTZ XEGTJ
ZCUVR GZUUT YIGBG RXESU BOTMY UAZNU TWAOT TOSUT ZVOQK
YZUVN KGJUL IURAS TTKGX OTMXU GJPAT IZOUT YKBKT ZNXKK
YKBKT IUSSG KGYZU LMXKK TGIXK YINUU RLOXK JAVUT HEUAX
VGZXU RYYZU VNGBK JKYZX UEKJH XOJMK UBKXO TJOGT IXKKQ


FIGURE 7.


d. Let the student now compare Figs. 6 and 7, which have been superimposed in Fig. 8
for convenience in examination. Crests and troughs are present in both distributions; moreover
their relative positions and frequencies have not been changed in the slightest particular. Only
the absolute position of the sequence as a whole has been displaced six intervals to the right in
Fig. 7, as compared with the absolute position of the sequence in Fig. 6.


FIGURE 8.
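The displacement described in subparagraph d can be verified by counting. The Python sketch below is an illustration and not part of the original text; the names `encipher`, `distribution`, and `displaced` are hypothetical. It enciphers the message of subparagraph a by the cipher alphabet of subparagraph c and compares the two distributions.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

PLAIN = (
    "HOSTILE FORCE ESTIMATED AT ONE REGIMENT INFANTRY AND TWO PLATOONS CAVALRY "
    "MOVING SOUTH ON QUINNIMONT PIKE STOP HEAD OF COLUMN NEARING ROAD JUNCTION SEVEN "
    "THREE SEVEN COMMA EAST OF GREENACRE SCHOOL FIRED UPON BY OUR PATROLS STOP "
    "HAVE DESTROYED BRIDGE OVER INDIAN CREEK"
).replace(" ", "")

def encipher(text, shift=6):
    """Direct standard alphabet in which plain-text A is represented by cipher G."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26] for ch in text)

def distribution(text):
    """Frequencies of A..Z as a list of 26 counts."""
    counts = Counter(text)
    return [counts[ch] for ch in ALPHABET]

plain_dist = distribution(PLAIN)
cipher_dist = distribution(encipher(PLAIN))

# Move every entry of the plain-text distribution six places to the right,
# treating the alphabet as a closed cycle.
displaced = plain_dist[-6:] + plain_dist[:-6]

print(displaced == cipher_dist)
print(cipher_dist[ALPHABET.index("K")])  # frequency of the highest crest
```

The wrap-around in `displaced` reflects the cyclic nature of the cipher component discussed in paragraph 17c.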


e. If the two distributions are compared in detail the student will clearly understand how
easy the solution of the cryptogram would be to one who knew nothing about how it was prepared.
For example, the frequency of the highest crest, representing Ep in Fig. 6, is 28; at an interval of
four letters before Ep there is another crest representing Ap with frequency 16. Between A and E
there is a trough, representing the low-frequency letters B, C, D. On the other side of E, at an
interval of four letters, comes another crest, representing I with frequency 14. Between E and I
there is another trough, representing the low-frequency letters F, G, H. Compare these crests
and troughs with their homologous crests and troughs in Fig. 7. In the latter, the letter K
marks the highest crest in the distribution with a frequency of 28; four letters before K there is
another crest, frequency 16, and four letters on the other side of K there is another crest, frequency
14. Troughs corresponding to B, C, D and F, G, H are seen at H, I, J and L, M, N in Fig. 7. In
fact, the two distributions may be made to coincide exactly, by shifting the frequency distribution
for the cryptogram six intervals to the left with respect to the distribution for the equivalent
plain-text message, as shown herewith.


FIGURE 9.

f. Let us suppose now that nothing is known about the cryptographing process, and that
only the cryptogram and its uniliteral frequency distribution are at hand. It is clear that simply
bearing in mind the spatial relations of the crests and troughs in a normal frequency distribution
would enable the cryptanalyst to fit the distribution to the normal in this case. He would
naturally first assume that Gc=Ap, from which it would follow that if a direct standard alphabet
is involved, Hc=Bp, Ic=Cp, and so on, yielding the following (tentative) deciphering alphabet:


Cipher____________ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Plain_____________ U V W X Y Z A B C D E F G H I J K L M N O P Q R S T


g. Now comes the final test: If these assumed values are substituted in the cipher text,
the plain text immediately appears. Thus:


NUYZO  RKLUX  IKKYZ  OSGZK  JGZUT
HOSTI  LEFOR  CEEST  IMATE  DATON  etc.
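
The substitution just performed can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decipher_direct` and its parameter are hypothetical, and the sketch assumes a direct standard alphabet in which a single cipher letter (here G) is taken to represent Ap.

```python
import string

ALPHABET = string.ascii_uppercase


def decipher_direct(cipher_text: str, cipher_for_a: str) -> str:
    """Decipher with a direct standard alphabet in which `cipher_for_a` stands for plain A."""
    shift = ALPHABET.index(cipher_for_a)  # G -> 6 intervals
    out = []
    for ch in cipher_text:
        if ch in ALPHABET:
            # Move each cipher letter back by the same number of intervals.
            out.append(ALPHABET[(ALPHABET.index(ch) - shift) % 26])
        else:
            out.append(ch)  # keep the spaces between groups
    return "".join(out)


print(decipher_direct("NUYZO RKLUX IKKYZ OSGZK JGZUT", "G"))
# HOSTI LEFOR CEEST IMATE DATON
```

Only one assumed value (Gc=Ap) is supplied; the remaining 25 values follow from the known sequence of the alphabet.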


h. It should be clear, therefore, that the selection of Gc to represent Ap in the cryptographing
process has absolutely no effect upon the relative spatial and frequency relations of the crests
and troughs of the frequency distribution for the cryptogram. If Qc had been selected to repre-
sent Ap, these relations would still remain the same, the whole series of crests and troughs being
merely displaced further to the right of the positions they occupy when Gc=Ap.
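
The point may be checked numerically. The following is a minimal sketch, assuming a direct standard alphabet; the helper names (`encipher_direct`, `distribution`, `rotate_right`) are illustrative only, and the short plain text is the fragment deciphered above.

```python
import string

ALPHABET = string.ascii_uppercase


def encipher_direct(plain_text, cipher_for_a):
    """Encipher with a direct standard alphabet in which plain A becomes `cipher_for_a`."""
    shift = ALPHABET.index(cipher_for_a)
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26]
        for ch in plain_text
        if ch in ALPHABET
    )


def distribution(text):
    """Uniliteral frequency distribution as a list of 26 counts, A through Z."""
    return [text.count(letter) for letter in ALPHABET]


def rotate_right(values, n):
    """Displace a distribution n intervals to the right, wrapping around at Z."""
    n %= len(values)
    return values[-n:] + values[:-n] if n else list(values)


plain = "HOSTILEFORCEESTIMATEDATON"
normal = distribution(plain)
for key in ("G", "Q"):
    shifted = distribution(encipher_direct(plain, key))
    # The crests and troughs are unchanged; only their absolute position moves.
    print(key, shifted == rotate_right(normal, ALPHABET.index(key)))
```

Both comparisons print `True`: the choice of Gc or Qc for Ap changes only the amount of displacement, 6 intervals in the one case and 16 in the other.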


19. Practical example of solution by the frequency method.--a. The case of direct standard
alphabet ciphers.-(1) The following cryptogram is to be solved by applying the foregoing
principles:

IBMQO  PBIUO  MBBGA  JCZOF  MUUQB  AJCZO
ZWILN  QTTML  EQBPU  IZKPQ  VOQVN  IVBZG

(2) From the presence of repetitions and so many low-frequency letters such as B, Q, and
Z it is at once suspected that this is a substitution cipher. But to illustrate the steps that must
be taken in difficult cases in order to be certain in this respect, a uniliteral frequency distribution

he will note that the relative positions and extensions of the crests and troughs are identical;
they merely progress in opposite directions.


20. Solution by completing the plain-component sequence.--a. The case of direct standard
alphabet ciphers.-(1) The foregoing method of analysis, involving as it does the construction of
a uniliteral frequency distribution, was termed a solution by the frequency method because it in-
volves the construction of a frequency distribution and its study. There is, however, another
method which is much more rapid, almost wholly mechanical, and which, moreover, does not
necessitate the construction or study of any frequency distribution whatever. An understand-
ing of the method follows from a consideration of the method of encipherment of a message
by the use of a single, direct standard cipher alphabet.
(2) Note the following encipherment:


Message_________ REPEL INVADING CAVALRY


ENCIPHERING ALPHABET


Plain_____________ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher____________ G H I J K L M N O P Q R S T U V W X Y Z A B C D E F


ENCIPHERMENT


Plain text________ R E P E L   I N V A D I N G   C A V A L R Y
Cryptogram________ X K V K R   O T B G J O T M   I G B G R X E


CRYPTOGRAM


XKVKR  OTBGJ  OTMIG  BGRXE

(3) The enciphering alphabet shown above represents a case wherein the sequence of letters
of both components of the cipher alphabet is the normal sequence, with the sequence forming the
cipher component merely shifted six intervals in retard (or 20 intervals in advance) of the posi-
tion it occupies in the normal alphabet. If, therefore, two strips of paper bearing the letters of
the normal sequence, equally spaced, are regarded as the two components of the cipher alphabet
and are juxtaposed at all of the 25 possible points of coincidence, it is obvious that one of these
25 juxtapositions must correspond to the actual juxtaposition shown in the enciphering alphabet
directly above.2 It is equally obvious that if a record were kept of the results obtained by ap-
plying the values given at each juxtaposition to the letters of the cryptogram, one of these results
would yield the plain text of the cryptogram.
(4) Let the work be systematized and the results set down in an orderly manner for exam-
ination. It is obviously unnecessary to juxtapose the two components so that Ac=Ap, for on
the assumption of a direct standard alphabet, juxtaposing two direct normal components at
their normal point of coincidence merely yields plain text. The next possible juxtaposition,
therefore, is Ac=Bp. Let the juxtaposition of the two sliding strips therefore be Ac=Bp, as shown
here:


Plain_______________ ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher______________  ABCDEFGHIJKLMNOPQRSTUVWXYZ


The values given by this juxtaposition are substituted for the first 20 letters of the cryptogram
and the following results are obtained.

Cryptogram____________ XKVKR  OTBGJ  OTMIG  BGRXE
1st Test-"Plain text"_ YLWLS  PUCHK  PUNJH  CHSYF


2 One of the strips should bear the sequence repeated. This permits juxtaposing the two sequences at all 26
possible points of coincidence so as to have a complete cipher alphabet showing at all times.

This certainly is not intelligible text; obviously, the two components were not in the position
indicated in this first test. The cipher component is therefore slid one interval to the right,
making Ac=Cp, and a second test is made. Thus:

Plain_______________ ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher______________   ABCDEFGHIJKLMNOPQRSTUVWXYZ


Cryptogram____________ XKVKR  OTBGJ  OTMIG  BGRXE
2d Test-"Plain text"__ ZMXMT  QVDIL  QVOKI  DITZG
Neither does the second test result in disclosing any plain text. But, if the results of the two
tests are studied a phenomenon that at first seems quite puzzling comes to light. Thus, suppose
the results of the two tests are superimposed in this fashion.

Cryptogram____________ XKVKR  OTBGJ  OTMIG  BGRXE
1st Test-"Plain text"_ YLWLS  PUCHK  PUNJH  CHSYF
2d Test-"Plain text"__ ZMXMT  QVDIL  QVOKI  DITZG
(5) Note what has happened. The net result of the two experiments was merely to continue
the normal sequence begun by the cipher letters at the heads of the several columns. It is
obvious that if the normal sequence is completed in each column the results will be exactly the same
as though the whole set of 25 possible tests had actually been performed. Let the columns therefore
be completed, as shown in Fig. 11.


 XKVKROTBGJOTMIGBGRXE
 YLWLSPUCHKPUNJHCHSYF
 ZMXMTQVDILQVOKIDITZG
 ANYNURWEJMRWPLJEJUAH
 BOZOVSXFKNSXQMKFKVBI
 CPAPWTYGLOTYRNLGLWCJ
 DQBQXUZHMPUZSOMHMXDK
 ERCRYVAINQVATPNINYEL
 FSDSZWBJORWBUQOJOZFM
 GTETAXCKPSXCVRPKPAGN
 HUFUBYDLQTYDWSQLQBHO
 IVGVCZEMRUZEXTRMRCIP
 JWHWDAFNSVAFYUSNSDJQ
 KXIXEBGOTWBGZVTOTEKR
 LYJYFCHPUXCHAWUPUFLS
 MZKZGDIQVYDIBXVQVGMT
 NALAHEJRWZEJCYWRWHNU
 OBMBIFKSXAFKDZXSXIOV
 PCNCJGLTYBGLEAYTYJPW
 QDODKHMUZCHMFBZUZKQX
*REPELINVADINGCAVALRY
 SFQFMJOWBEJOHDBWBMSZ
 TGRGNKPXCFKPIECXCNTA
 UHSHOLQYDGLQJFDYDOUB
 VITIPMRZEHMRKGEZEPVC
 WJUJQNSAFINSLHFAFQWD


FIGURE 11.
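
The completion diagram can be generated mechanically. The sketch below is a minimal illustration, assuming a direct normal sequence as the plain component; the function name `generatrices` is a label chosen here for convenience.

```python
import string

ALPHABET = string.ascii_uppercase


def generatrices(cipher_text):
    """Complete the normal sequence in every column; row 0 is the cryptogram itself."""
    letters = [ch for ch in cipher_text if ch in ALPHABET]
    rows = []
    for step in range(26):
        # Each row advances every column one further interval along A B C ... Z.
        rows.append("".join(ALPHABET[(ALPHABET.index(ch) + step) % 26] for ch in letters))
    return rows


rows = generatrices("XKVKR OTBGJ OTMIG BGRXE")
for row in rows:
    print(row)

print(rows[20])
# REPELINVADINGCAVALRY
```

The loop writes out all 26 lines at once, which is the same as performing every one of the possible sliding-strip tests; the examination of the rows for intelligible text is still left to the eye.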


An examination of the successive horizontal lines of the diagram discloses one and only one line
of plain text, that marked by the asterisk and reading R E P E L I N V A D I N G C A V A L R Y.
(6) Since each column in Fig. 11 is nothing but a normal sequence, it is obvious that instead
of laboriously writing down these columns of letters every time a cryptogram is to be examined,
it would be more convenient to prepare a set of strips each bearing the normal sequence doubled
(to permit complete coincidence for an entire alphabet at any setting), and have them available
for examining any future cryptograms. In using such a set of sliding strips in order to solve a
cryptogram prepared by means of a single direct standard cipher alphabet, or to make a test to
determine whether a cryptogram has been so prepared, it is only necessary to "set up" the letters
of the cryptogram on the strips, that is, align them in a single row across the strips (by sliding
the individual strips up or down). The successive horizontal lines, called generatrices (singular,
generatrix), are then examined in a search for intelligible text. If the cryptogram really belongs
to this simple type of cipher, one of the generatrices will exhibit intelligible text all the way
across; this text will practically invariably be the plain text of the message. This method of
analysis may be termed a solution by completing the plain-component sequence. Sometimes it is
referred to as "running down" the sequence. The principle upon which the method is based
constitutes one of the cryptanalyst's most valuable tools.3
b. The case of reversed standard alphabets.-(1) The method described under subpar. a may
also be applied, in slightly modified form, in the case of a cryptogram enciphered by a single
reversed standard alphabet. The basic principles are identical in the two cases.
(2) To show this it is necessary to experiment with two sliding components as before, except
that in this case one of the components must be a reversed normal sequence, the other, a direct
normal sequence.
(3) Let the two components be juxtaposed A to A, as shown below, and then let the resultant
values be substituted for the letters of the cryptogram. Thus:


CRYPTOGRAM


PCRCV  YTLGD  YTAEG  LGVPI


Plain_______________ ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher______________  ZYXWVUTSRQPONMLKJIHGFEDCBA


Cryptogram____________ PCRCV  YTLGD  YTAEG  LGVPI
1st Test-"Plain text"_ LYJYF  CHPUX  CHAWU  PUFLS

(4) This does not yield intelligible text, and therefore the reversed component is slid one
space forward and a second test is made. Thus:


Plain_______________ ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher______________   ZYXWVUTSRQPONMLKJIHGFEDCBA


Cryptogram____________ PCRCV  YTLGD  YTAEG  LGVPI
2d Test-"Plain text"__ MZKZG  DIQVY  DIBXV  QVGMT


(5) Neither does the second test yield intelligible text. But let the results of the two tests
be superimposed. Thus:


Cryptogram____________ PCRCV  YTLGD  YTAEG  LGVPI
1st Test-"Plain text"_ LYJYF  CHPUX  CHAWU  PUFLS
2d Test-"Plain text"__ MZKZG  DIQVY  DIBXV  QVGMT


3 It is recommended that the student prepare a set of 25 strips !4 by * by 15 inches, made of well-seasoned
wood, and glue alphabet strips to the wood. The alphabet on each strip should be a double or repeated alphabet
with all letters equally spaced.

(6) It is seen that the letters of the "plain text" given by the second trial are merely the
continuants of the normal sequences initiated by the letters of the "plain text" given by the first
trial. If these sequences are "run down"-that is, completed within the columns-the results
must obviously be the same as though successive tests exactly similar to the first two were
applied to the cryptogram, using one reversed normal and one direct normal component. If the
cryptogram has really been prepared by means of a single reversed standard alphabet, one of
the generatrices of the diagram that results from completing the sequences must yield intelligible
text.
(7) Let the diagram be made, or better yet, if the student has already at hand the set of
sliding strips referred to in the footnote to page 36, let him "set up" the letters given by the
first trial. Fig. 12 shows the diagram and indicates the plain-text generatrix.


 PCRCVYTLGDYTAEGLGVPI
 LYJYFCHPUXCHAWUPUFLS
 MZKZGDIQVYDIBXVQVGMT
 NALAHEJRWZEJCYWRWHNU
 OBMBIFKSXAFKDZXSXIOV
 PCNCJGLTYBGLEAYTYJPW
 QDODKHMUZCHMFBZUZKQX
*REPELINVADINGCAVALRY
 SFQFMJOWBEJOHDBWBMSZ
 TGRGNKPXCFKPIECXCNTA
 UHSHOLQYDGLQJFDYDOUB
 VITIPMRZEHMRKGEZEPVC
 WJUJQNSAFINSLHFAFQWD
 XKVKROTBGJOTMIGBGRXE
 YLWLSPUCHKPUNJHCHSYF
 ZMXMTQVDILQVOKIDITZG
 ANYNURWEJMRWPLJEJUAH
 BOZOVSXFKNSXQMKFKVBI
 CPAPWTYGLOTYRNLGLWCJ
 DQBQXUZHMPUZSOMHMXDK
 ERCRYVAINQVATPNINYEL
 FSDSZWBJORWBUQOJOZFM
 GTETAXCKPSXCVRPKPAGN
 HUFUBYDLQTYDWSQLQBHO
 IVGVCZEMRUZEXTRMRCIP
 JWHWDAFNSVAFYUSNSDJQ
 KXIXEBGOTWBGZVTOTEKR


FIGURE 12.


(8) The only difference in procedure between this case and the preceding one (where the
cipher alphabet was a direct standard alphabet) is that the letters of the cipher text are first
"deciphered" by means of any reversed standard alphabet and then the columns are "run down",
according to the normal A B C . . . Z sequence. For reasons which will become apparent very
soon, the first step in this method is technically termed converting the cipher letters into their
plain-component equivalents; the second step is the same as before, viz, completing the plain-com-
ponent sequence.
21. Special remarks on the method of solution by completing the plain-component sequence.-
a. The terms employed to designate the steps in the solution set forth in Par. 20b, viz, "con-
verting the cipher letters into their plain-component equivalents" and "completing the plain-
component sequence", accurately describe the process. Their meaning will become more clear
as the student progresses with the work. It may be said that whenever the plain component of
a cipher alphabet is a known sequence, no matter how it is composed, the difficulty and time
required to solve any cryptogram involving the use of that plain component is practically cut
in half. In some cases this knowledge facilitates, and in other cases is the only thing that makes
possible the solution of a very short cryptogram that might otherwise defy solution. Later on an
example will be given to illustrate what is meant in this regard.
b. The student should take note, however, of two qualifying expressions that were employed
in a preceding paragraph to describe the results of the application of the method. It was stated
that "one of the generatrices will exhibit intelligible text all the way across; this text will practically
invariably be the plain text." Will there ever be a case in which more than one generatrix will
yield intelligible text throughout its extent? That obviously depends almost entirely on the
number of letters that are aligned to form a generatrix. If a generatrix contains but a very few
letters, only five, for example, it may happen as a result of pure chance that there will be two or
more generatrices showing what might be "intelligible text." Note in Fig. 11, for example, that
there are several cases in which 3-letter and 4-letter English words (ANY, VAIN, GOT, TIP, etc.)
appear on generatrices that are not correct, these words being formed by pure chance. But there
is not a single case, in this diagram, of a 5-letter or longer word appearing fortuitously, because
obviously the longer the word the smaller the probability of its appearance purely by chance;
and the probability that two generatrices of 15 letters each will both yield intelligible text along
their entire length is exceedingly remote, so remote, in fact, that in practical cryptography such
a case may be considered nonexistent.4
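
The fortuitous short words mentioned in subpar. b can be located in the diagram of Fig. 11 by a simple search. This is an illustrative sketch only; the helper names are hypothetical, and the word list is limited to the four examples cited in the text.

```python
import string

ALPHABET = string.ascii_uppercase


def generatrices(cipher_text):
    """Rows of the completion diagram; row 0 is the cryptogram itself."""
    letters = [ch for ch in cipher_text if ch in ALPHABET]
    return [
        "".join(ALPHABET[(ALPHABET.index(ch) + step) % 26] for ch in letters)
        for step in range(26)
    ]


def rows_containing(rows, words):
    """For each word, list the row numbers of the generatrices in which it appears."""
    return {word: [i for i, row in enumerate(rows) if word in row] for word in words}


rows = generatrices("XKVKROTBGJOTMIGBGRXE")
found = rows_containing(rows, ["ANY", "VAIN", "GOT", "TIP"])
print(found)
# {'ANY': [3], 'VAIN': [7], 'GOT': [13], 'TIP': [24]}
```

Each of these words lies on a different incorrect generatrix, and none of those generatrices reads as intelligible text throughout; only row 20 does.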


c. The student should observe that in reality there is no difference whatsoever in principle
between the two methods presented in subpars. a and b of Par. 20. In the former the preliminary
step of converting the cipher letters into their plain-component equivalents is apparently not
present but in reality it is there. The reason for its apparent absence is that in that case the
plain component of the cipher alphabet is identical in all respects with the cipher component, so
that the cipher letters require no conversion, or, rather, they are identical with the equivalents
that would result if they were converted on the basis Ac=Ap. In fact, if the solution process had
been arbitrarily initiated by converting the cipher letters into their plain-component equivalents
at the setting Ac=Op, for example, and the cipher component slid one interval to the right there-
after, the results of the first and second tests of Par. 20a would be as follows:


Cryptogram____________ XKVKR  OTBGJ  OTMIG  BGRXE
1st Test-"Plain text"_ LYJYF  CHPUX  CHAWU  PUFLS
2d Test-"Plain text"__ MZKZG  DIQVY  DIBXV  QVGMT


Thus, the foregoing diagram duplicates in every particular the diagram resulting from the first
two tests under Par. 20b: a first line of cipher letters, a second line of letters derived from them
but showing externally no relationship with the first line, and a third line derived immediately
from the second line by continuing the direct normal sequence. This point is brought to attention
only for the purpose of showing that a single, broad principle is the basis of the general method of
solution by completing the plain-component sequence, and once the student has this firmly in
mind he will have no difficulty whatsoever in realizing when the principle is applicable, what a
powerful cryptanalytic tool it can be, and what results he may expect from its application in
specific instances.


4 A person with patience and an inclination toward the curiosities of the science might construct a text of 15
or more letters which would yield two "intelligible" texts on the plain-component completion diagram.
d. In the two foregoing examples of the application of the principle, the plain component
was a normal sequence but it should be clear to the student, if he has grasped what has been said
in the preceding subparagraph, that this component may be a mixed sequence which, if known
(that is, if the sequence of letters comprising the sequence is known to the cryptanalyst), can be
handled just as readily as can a plain component that is a normal sequence.
e. It is entirely immaterial at what points the plain and the cipher components are juxtaposed
in the preliminary step of converting the cipher letters into their plain-component equivalents.
For example, in the case of the reversed alphabet cipher solved in Par. 20b, the two components
were arbitrarily juxtaposed to give the value A=A, but they might have been juxtaposed at any
of the other 25 possible points of coincidence without in any way affecting the final result, viz, the
production of one plain-text generatrix in the completion diagram.
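
The two steps of Par. 20b, and the statement that the initial juxtaposition is immaterial, can be tried out with the following sketch. It assumes a reversed normal cipher component and a direct normal plain component; the names `convert_reversed` and `run_down` are illustrative only.

```python
import string

ALPHABET = string.ascii_uppercase


def convert_reversed(cipher_text, plain_for_cipher_a="A"):
    """Step 1: convert cipher letters into their plain-component equivalents.

    The reversed component is set so that cipher A lies opposite `plain_for_cipher_a`.
    """
    offset = ALPHABET.index(plain_for_cipher_a)
    return "".join(
        ALPHABET[(offset - ALPHABET.index(ch)) % 26]
        for ch in cipher_text
        if ch in ALPHABET
    )


def run_down(letters):
    """Step 2: complete the normal sequence in each column (26 generatrices)."""
    return [
        "".join(ALPHABET[(ALPHABET.index(ch) + step) % 26] for ch in letters)
        for step in range(26)
    ]


cryptogram = "PCRCV YTLGD YTAEG LGVPI"
first_trial = convert_reversed(cryptogram, "A")
print(first_trial)
# LYJYFCHPUXCHAWUPUFLS
print(run_down(first_trial)[6])
# REPELINVADINGCAVALRY

# Whatever the initial setting, the plain-text generatrix still appears in the diagram.
for setting in ALPHABET:
    assert "REPELINVADINGCAVALRY" in run_down(convert_reversed(cryptogram, setting))
```

Changing the setting merely changes which row of the diagram carries the plain text; the set of 26 generatrices is the same in every case.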


22. Value of mechanical solution as a short cut.--a. It is obvious that the very first step
the student should take in his attempts to solve an unknown cryptogram that is obviously a
substitution cipher is to try the mechanical method of solution by completing the plain-component
sequence, using the normal alphabet, first direct, then reversed. This takes only a very few
minutes and is conclusive in its results. It saves the labor and trouble of constructing a frequency
distribution in case the cipher is of this simple type. Later on it will be seen how certain varia-
tions of this simple type may also be solved by the application of this method. Thus, a very
easy short cut to solution is afforded, which even the experienced cryptanalyst never overlooks
in his first attack on an unknown cipher.
b. It is important now to note that if neither of the two foregoing attempts is successful in
bringing plain text to light and the cryptogram is quite obviously monoalphabetic in character, the
cryptanalyst is warranted in assuming that the cryptogram involves a mixed cipher alphabet.5 The
steps to be taken in attacking a cipher of the latter type will be discussed in the next section.
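
The short cut, first direct and then reversed, may be sketched as a single routine. This is an illustration under stated assumptions: the function names are hypothetical, and the search for a suspected word (`find_word`) is a convenience introduced here in place of the visual examination of the generatrices that the text describes.

```python
import string

ALPHABET = string.ascii_uppercase


def mechanical_tests(cipher_text):
    """Return the 26 generatrices for the direct case and the 26 for the reversed case."""
    values = [ALPHABET.index(ch) for ch in cipher_text if ch in ALPHABET]
    direct = ["".join(ALPHABET[(v + s) % 26] for v in values) for s in range(26)]
    # Converting through a reversed alphabet negates each value; running down adds s.
    reversed_rows = ["".join(ALPHABET[(s - v) % 26] for v in values) for s in range(26)]
    return {"direct": direct, "reversed": reversed_rows}


def find_word(cipher_text, word):
    """List (alphabet type, generatrix) pairs whose generatrix contains `word`."""
    return [
        (kind, row)
        for kind, rows in mechanical_tests(cipher_text).items()
        for row in rows
        if word in row
    ]


print(find_word("XKVKR OTBGJ OTMIG BGRXE", "CAVALRY"))
# [('direct', 'REPELINVADINGCAVALRY')]
print(find_word("PCRCV YTLGD YTAEG LGVPI", "CAVALRY"))
# [('reversed', 'REPELINVADINGCAVALRY')]
```

If neither set of 26 lines shows plain text, the cryptogram is not of the standard-alphabet type, and the analysis proceeds as described in the next section.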


5 There is but one other possibility, already referred to under Par. 17d, which involves the case where trans-
position and monoalphabetic substitution processes have been applied in successive steps. This is unusual,
however, and will be discussed in its proper place.


SECTION VI


UNILITERAL SUBSTITUTION WITH MIXED CIPHER ALPHABETS 


                                                                                                   Paragraph
Basic reason for the low degree of cryptographic security afforded by monoalphabetic cryptograms
  involving standard cipher alphabets ____________________________________________________________ 23
Preliminary steps in the analysis of a monoalphabetic, mixed-alphabet cryptogram _________________ 24
Further data concerning normal plain text ________________________________________________________ 25
Preparation of the work sheet ____________________________________________________________________ 26
Triliteral-frequency distributions _______________________________________________________________ 27
Classifying the cipher letters into vowels and consonants ________________________________________ 28
Further analysis of the letters representing vowels and consonants _______________________________ 29
Substituting deduced values in the cryptogram ____________________________________________________ 30
Completing the solution __________________________________________________________________________ 31
General remarks on the foregoing solution ________________________________________________________ 32
The "probable-word" method; its value and applicability __________________________________________ 33
Solution of additional cryptograms produced by the same cipher component _________________________ 34
23. Basic reason for the low degree of cryptographic security afforded by monoalphabetic
cryptograms involving standard cipher alphabets.-The student has seen that the solution of
monoalphabetic cryptograms involving standard cipher alphabets is a very easy matter. Two
methods of analysis were described, one involving the construction of a frequency distribution,
the other not requiring this kind of tabulation, being almost mechanical in nature and corre-
spondingly rapid. In the first of these two methods it was necessary to make a correct assumption
as to the value of but one of the 26 letters of the cipher alphabet and the values of the remaining
25 letters at once become known; in the second method it was not necessary to assume a value
for even a single cipher letter. The student should understand what constitutes the basis of this
situation, viz, the fact that the two components of the cipher alphabet are composed of known
sequences. What if one or both of these components are, for the cryptanalyst, unknown sequences?
In other words, what difficulties will confront the cryptanalyst if the cipher component of the
cipher alphabet is a mixed sequence? Will such an alphabet be solvable as a whole at one stroke,
or will it be necessary to solve its values individually? Since the determination of the value of
one cipher letter in this case gives no direct clues to the value of any other letter, it would seem
that the solution of such a cipher should involve considerably more analysis and experiment than
has the solution of either of the two types of ciphers so far examined occasioned. A typical
example will be studied.


24. Preliminary steps in the analysis of a monoalphabetic, mixed alphabet cryptogram.- 
a. Note the following cryptogram: 


SFDZF IOGHL PZFGZ DYSPF HBZDS GVHTF UPLVD FGYVJ VFVHT GADZZ AITYD 
ZYFZJ ZTGPT VTZBD VFHTZ DFXSB GIDZY VTXOI YVTEF VMGZZ THLLV XZDFM 
HTZAI TYDZY BDVFH TZDFK ZDZZJ SXISG ZYGAV FSLGZ DTl-U-IT CDZRS VTYZD 
OZFFH TZAIT YDZYG AVDGZ ZTKHI TYZYS DZGHU ZFZTG UPGDI XWGHX ASR~ 
DFUID EGHTV EAGXX 


b. A casual inspection of the text discloses the presence of several long repetitions as well as
of many letters of normally low frequency, such as F, G, V, X, and Z; on the other hand, letters of
normally high frequency, such as the vowels, and the consonants N and R, are relatively scarce.
The cryptogram is obviously a substitution cipher and the usual mechanical tests for determining
whether it is possibly of the monoalphabetic, standard-alphabet type are applied. The results
being negative, a uniliteral frequency distribution is immediately constructed and is as shown
in Figure 13.


FIGURE 13.


c. The fact that the frequency distribution shows very marked crests and troughs means
that the cryptogram is undoubtedly monoalphabetic; the fact that it has already been tested
(by the method of completing the plain-component sequence) and found not to be of the mono-
alphabetic, standard-alphabet type, indicates with a high degree of probability that it involves
a mixed cipher alphabet. A few moments might be devoted to making a careful inspection of the
distribution to insure that it cannot be made to fit the normal; the object of this would be to
rule out the possibility that the text resulting from substitution by a standard cipher alphabet
had subsequently been transposed. But this inspection in this case is hardly necessary, in
view of the presence of long repetitions in the message.1 (See Par. 13g.)
d. One might, of course, attempt to solve the cryptogram by applying the simple principles
of frequency. One might, in other words, assume that Zc (the letter of greatest frequency)
represents Ep, Dc (the letter of next greatest frequency) represents Tp, and so on. If the message
were long enough this simple procedure might more or less quickly give the solution. But the
message is relatively short and many difficulties would be encountered. Much time and effort
would be expended unnecessarily, because it is hardly to be expected that in a message of only
235 letters the relative order of frequency of the various cipher letters should exactly coincide
with, or even closely approximate, the relative order of frequency of letters of normal plain text
found in a count of 50,000 letters. It is to be emphasized that the beginner must repress the natural
tendency to place too much confidence in the generalized principles of frequency and to rely too much
upon them. It is far better to bring into effective use certain other data concerning normal
plain text which thus far have not been brought to notice.
25. Further data concerning normal plain text.-a. Just as the individual letters constituting
a large volume of plain text have more or less characteristic or fixed frequencies, so it is found
that digraphs and trigraphs have characteristic frequencies, when a large volume of text is
studied statistically. In Appendix 1, Table 6, are shown the relative frequencies of all digraphs
appearing in the 260 telegrams referred to in Paragraph 9e. It will be noted that 428 of the 676
possible pairs of letters occur in these telegrams, but whereas many of them occur but once or
twice, there are a few which occur hundreds of times.
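
A digraphic count of the kind tabulated in Appendix 1 can be made as in the following sketch. It is an illustration only: the function name `digraph_counts` is hypothetical, the sample text is the short message used earlier rather than the telegrams of the appendix, and the sketch assumes that digraphs are counted in overlapping fashion with word spacing ignored.

```python
from collections import Counter


def digraph_counts(text):
    """Count overlapping pairs of successive letters, ignoring spaces and punctuation."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


counts = digraph_counts("REPEL INVADING CAVALRY")
print(counts["IN"], counts["VA"])
# 2 2
print(len(counts), "different digraphs out of", 26 * 26, "possible")
```

Even in so short a text certain digraphs already repeat, while the great majority of the 676 possible pairs do not occur at all.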
b. In Appendix 1 will also be found several other kinds of tables and lists which will be useful
to the student in his work, such as the relative order of frequency of the 50 digraphs of greatest
frequency, the relative order of frequency of doubled letters, doubled vowels, doubled consonants,
and so on. It is suggested that the student refer to this appendix now, to gain an idea of the
data available for his future reference. Just how these data may be employed will become ap-
parent very shortly.


1 This possible step is mentioned here for the purpose of making it clear that the plain-component sequence
completion method cannot solve a case in which transposition has followed or preceded monoalphabetic substi-
tution with standard alphabets. Cases of this kind will be discussed in a later text. It is sufficient to indicate
at this point that the frequency distribution for such a combined substitution-transposition cipher would present
the characteristics of a standard alphabet cipher-and yet the method of completing the plain-component
sequence would fail to bring out any plain text.
26. Preparation of the work sheet.-a. The details to be considered in this paragraph may
at first appear to be superfluous but long experience has proved that systematization of the
work, and preparation of the data in the most utilizable, condensed form is most advisable, even
if this seems to take considerable time. In the first place if it merely serves to avoid interrup-
tions and irritations occasioned by failure to have the data in an instantly available form, it
will pay by saving mental wear and tear. In the second place, especially in the case of com-
plicated cryptograms, painstaking care in these details, while it may not always bring about
success, is often the factor that is of greatest assistance in ultimate solution. The detailed
preparation of the data may be irksome to the student, and he may be tempted to avoid as much
of it as possible, but, unfortunately, in the early stages of solving a cryptogram he does not know
(nor, for that matter, does the expert always know) just which data are essential and which
may be neglected. Even though not all of the data may turn out to have been necessary, as a
general rule, time is saved in the end if all the usual data are prepared as a regular preliminary
to the solution of most cryptograms.


b. First, the cryptogram is recopied in the form of a work sheet. This sheet should be of a
good quality of paper so as to withstand considerable erasure. If the cryptogram is to be
copied by hand, cross-section paper of ~-inch squares is extremely useful. The writing should
be in ink, and plain, carefully made roman capital letters should be used in all cases. If the
cryptogram is to be copied on a typewriter, the ribbon employed should be impregnated with an
ink that will not smear or smudge under the hand.

c. The arrangement of the characters of the cryptogram on the work sheet is a matter of
considerable importance. If the cryptogram as first obtained is in groups of regular length
(usually five characters to a group) and if the uniliteral frequency distribution shows the crypto-
gram to be monoalphabetic, the characters should be copied without regard to this grouping.
It is advisable to allow two spaces between letters, and to write a constant number of letters
per line, approximately 25. At least two spaces, preferably three spaces, should be left between 
horizontal lines. Care should be taken to avoid crowding the letters in any case, for this is 
not only confusing to the eye but also mentally irritating when later it is found that not enough 
space has been left for making various sorts of marks or indications. If the cryptogram is origi- 
nally in what appears to be word lengths (and this is the case, as a rule, only with the cryptograms 
of amateurs), naturally it should be copied on the work sheet in the original groupings. If 
further study of a cryptogram shows that some special grouping is required, it is often best to 
recopy it on a fresh work sheet rather than to attempt to indicate the new grouping on the old 
work sheet. 
d. In order to be able to locate or refer to specific letters or groups of letters with speed, 
certainty, and without possibility of confusion, it is advisable to use coordinates applied to the 
lines and columns of the text as it appears on the work sheet. To minimize possibility of con- 
fusion, it is best to apply letters to the horizontal lines of the text, numbers to the vertical columns. 
In referring to a letter the horizontal line in which the letter is located is usually given first. Thus, 
referring to the work sheet shown below, coordinates A17 designate the letter Y, the 17th letter 
in the first line. The letter I is usually omitted from the series of line indicators so as to avoid 
confusion with the figure 1. If lines are limited to 25 letters each, then each set of 100 letters of 
the text is automatically blocked off by remembering that 4 lines constitute 100 letters. 
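The coordinate scheme just described can be sketched in a few lines of Python. This is only an illustration: the names `ROWS`, `LABELS`, and `letter_at` are assumptions introduced here and do not come from the text. The rows are the cipher text of the work sheet, written 25 letters to the line.

```python
import string

# Cipher text of the work sheet, 25 letters to the line (last line is short).
ROWS = [
    "SFDZFIOGHLPZFGZDYSPFHBZDS",
    "GVHTFUPLVDFGYVJVFVHTGADZZ",
    "AITYDZYFZJZTGPTVTZBDVFHTZ",
    "DFXSBGIDZYVTXOIYVTEFVMGZZ",
    "THLLVXZDFMHTZAITYDZYBDVFH",
    "TZDFKZDZZJSXISGZYGAVFSLGZ",
    "DTHHTCDZRSVTYZDOZFFHTZAIT",
    "YDZYGAVDGZZTKHITYZYSDZGHU",
    "ZFZTGUPGDIXWGHXASRUZDFUID",
    "EGHTVEAGXX",
]

# Line indicators A, B, C, ... with I omitted to avoid confusion with the figure 1.
LABELS = [c for c in string.ascii_uppercase if c != "I"][:len(ROWS)]


def letter_at(coord):
    """Return the cipher letter at a coordinate such as 'A17' (line first, then column)."""
    line = LABELS.index(coord[0])
    column = int(coord[1:])
    return ROWS[line][column - 1]


print(letter_at("A17"))  # 17th letter of the first line
```

Because every full line holds 25 letters, four lines block off exactly 100 letters, as noted above.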
e. Above each character of the cipher text may be some indication of the frequency of that 
character in the whole cryptogram. This indication may be the actual number of times the 
character occurs, or, if colored pencils are used, the cipher letters may be divided up into three
categories or groups-high frequency, medium frequency, and low frequency. It is perhaps 
simpler, if clerical help is available, to indicate the actual frequencies. This saves constant 
reference to the frequency tables, which interrupts the train of thought, and saves considerable 
time in the end. 
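Where clerical help is not available, the frequency annotations can be prepared mechanically. The following is a minimal sketch, assuming the cryptogram is held as a list of lines; the names `ROWS`, `freq`, and `annotate` are hypothetical.

```python
from collections import Counter

# Cipher text of the work sheet, 25 letters to the line.
ROWS = [
    "SFDZFIOGHLPZFGZDYSPFHBZDS",
    "GVHTFUPLVDFGYVJVFVHTGADZZ",
    "AITYDZYFZJZTGPTVTZBDVFHTZ",
    "DFXSBGIDZYVTXOIYVTEFVMGZZ",
    "THLLVXZDFMHTZAITYDZYBDVFH",
    "TZDFKZDZZJSXISGZYGAVFSLGZ",
    "DTHHTCDZRSVTYZDOZFFHTZAIT",
    "YDZYGAVDGZZTKHITYZYSDZGHU",
    "ZFZTGUPGDIXWGHXASRUZDFUID",
    "EGHTVEAGXX",
]

# Uniliteral frequency of every cipher letter in the whole cryptogram.
freq = Counter("".join(ROWS))


def annotate(row):
    """Return two aligned strings: the frequency of each letter, and the letters below."""
    counts = " ".join(f"{freq[ch]:>2}" for ch in row)
    letters = " ".join(f"{ch:>2}" for ch in row)
    return counts, letters


for row in ROWS:
    for line in annotate(row):
        print(line)
    print()
```

Printing the actual count above each letter, rather than a high, medium, or low class, follows the preference stated in the text.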
f. After the special frequency distribution, explained in Par. 27 below, has been constructed,
repetitions of digraphs and trigraphs should be underscored. In so doing, the student should be
particularly watchful of trigraphic repetitions which can be further extended into tetragraphs
and polygraphs of greater length. Repetitions of more than ten characters should be set off by
heavy vertical lines, as they indicate repeated phrases and are of considerable assistance in
solution. If a repetition continues from one line to the next, put an arrow at the end of the
underscore to signal this fact. Reversible digraphs should also be indicated by an underscore
with an arrow pointing in both directions. Anything which strikes the eye as being peculiar,
unusual, or significant as regards the distribution or recurrence of the characters should be
noted. All these marks should, if convenient, be made with ink so as not to cause smudging.
The work sheet will now appear as shown herewith (not all the repetitions are underscored):


      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

     10 19 23 35 19 10  3 19 15  5  5 35 19 19 35 23 14 10  5 19 15  4 35 23 10
A     S  F  D  Z  F  I  O  G  H  L  P  Z  F  G  Z  D  Y  S  P  F  H  B  Z  D  S

     19 16 15 22 19  5  5  5 16 23 19 19 14 16  3 16 19 16 15 22 19  8 23 35 35
B     G  V  H  T  F  U  P  L  V  D  F  G  Y  V  J  V  F  V  H  T  G  A  D  Z  Z

      8 10 22 14 23 35 14 19 35  3 35 22 19  5 22 16 22 35  4 23 16 19 15 22 35
C     A  I  T  Y  D  Z  Y  F  Z  J  Z  T  G  P  T  V  T  Z  B  D  V  F  H  T  Z

     23 19  8 10  4 19 10 23 35 14 16 22  8  3 10 14 16 22  3 19 16  2 19 35 35
D     D  F  X  S  B  G  I  D  Z  Y  V  T  X  O  I  Y  V  T  E  F  V  M  G  Z  Z

     22 15  5  5 16  8 35 23 19  2 15 22 35  8 10 22 14 23 35 14  4 23 16 19 15
E     T  H  L  L  V  X  Z  D  F  M  H  T  Z  A  I  T  Y  D  Z  Y  B  D  V  F  H

     22 35 23 19  2 35 23 35 35  3 10  8 10 10 19 35 14 19  8 16 19 10  5 19 35
F     T  Z  D  F  K  Z  D  Z  Z  J  S  X  I  S  G  Z  Y  G  A  V  F  S  L  G  Z

     23 22 15 15 22  1 23 35  2 10 16 22 14 35 23  3 35 19 19 15 22 35  8 10 22
G     D  T  H  H  T  C  D  Z  R  S  V  T  Y  Z  D  O  Z  F  F  H  T  Z  A  I  T

     14 23 35 14 19  8 16 23 19 35 35 22  2 15 10 22 14 35 14 10 23 35 19 15  5
H     Y  D  Z  Y  G  A  V  D  G  Z  Z  T  K  H  I  T  Y  Z  Y  S  D  Z  G  H  U

     35 19 35 22 19  5  5 19 23 10  8  1 19 15  8  8 10  2  5 35 23 19  5 10 23
J     Z  F  Z  T  G  U  P  G  D  I  X  W  G  H  X  A  S  R  U  Z  D  F  U  I  D

      3 19 15 22 16  3  8 19  8  8
K     E  G  H  T  V  E  A  G  X  X

(Underscores marking repetitions are not reproduced.)


27. Triliteral frequency distributions.-a. In what has gone before, a type of frequency
distribution known as a uniliteral frequency distribution was used. This, of course, shows only
the number of times each individual letter occurs. In order to apply the normal digraphic and
trigraphic frequency data (given in Appendix 1) to the solution of a cryptogram of the type now
being studied, it is obvious that the data with respect to digraphs and trigraphs occurring in the
cryptogram should be compiled and should be compared with the data for normal plain text. In
order to accomplish this in suitable manner, it is advisable to construct a slightly more
complicated form of distribution termed a triliteral frequency distribution.2


b. Given a cryptogram of 50 or more letters and the task of determining what trigraphs are
present in the cryptogram, there are three ways in which the data may be arranged or assembled.
One may require that the data show (1) each letter with its two succeeding letters; (2) each letter 
with its two preceding letters; (3) each letter with one preceding letter and one succeeding letter. 


c. A distribution of the first of the three foregoing types may be designated as a "triliteral
frequency distribution showing two suffixes"; the second type may be designated as a
"triliteral frequency distribution showing two prefixes"; the third type may be designated as
a "triliteral frequency distribution showing one prefix and one suffix." Quadriliteral and
pentaliteral frequency distributions may occasionally be found useful.
d. Which of these three arrangements is to be employed at a specific time depends largely
upon what the data are intended to show. For present purposes, in connection with the solution
of a monoalphabetic substitution cipher employing a mixed alphabet, possibly the third
arrangement, that showing one prefix and one suffix, is most satisfactory.
e. It is convenient to use ¼-inch cross-section paper for the construction of a triliteral
frequency distribution in the form of a distribution showing crests and troughs, such as that in
Figure 14. In that figure the prefix to each letter to be recorded is inserted in the left half of the
cell directly above the cipher letter being recorded; the suffix to each letter is inserted in the right
half of the cell directly above the letter being recorded; and in each case the prefix and the
suffix to the letter being recorded occupy the same cell, the prefix being directly to the left of the
suffix. The number in parentheses gives the total frequency for each letter.


2 Heretofore such a distribution has been termed a "trigraphic frequency table." It is thought that the word
"triliteral" is more suitable, to correspond with the designation "uniliteral" in the case of the distribution of the
single letters. A trigraphic distribution of A B C D E F would consider only the trigraphs A B C and D E F,
whereas a triliteral distribution would consider the groups A B C, B C D, C D E, and D E F.
(See also Par. 11d.) The use of the word "distribution" to replace the word "table" has already been explained.
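A triliteral frequency distribution showing one prefix and one suffix can be compiled mechanically. The sketch below is illustrative only; the function name `triliteral_distribution` and the use of "-" for a missing neighbor at either end of the text are assumptions made here. The cryptogram is treated as one continuous text, so that a prefix or suffix may lie on the adjoining line of the work sheet.

```python
from collections import defaultdict

# The cryptogram of the work sheet as one continuous text.
CIPHER = (
    "SFDZFIOGHLPZFGZDYSPFHBZDS"
    "GVHTFUPLVDFGYVJVFVHTGADZZ"
    "AITYDZYFZJZTGPTVTZBDVFHTZ"
    "DFXSBGIDZYVTXOIYVTEFVMGZZ"
    "THLLVXZDFMHTZAITYDZYBDVFH"
    "TZDFKZDZZJSXISGZYGAVFSLGZ"
    "DTHHTCDZRSVTYZDOZFFHTZAIT"
    "YDZYGAVDGZZTKHITYZYSDZGHU"
    "ZFZTGUPGDIXWGHXASRUZDFUID"
    "EGHTVEAGXX"
)


def triliteral_distribution(text):
    """Map each letter to the list of (prefix, suffix) pairs it occurs with, in text order."""
    dist = defaultdict(list)
    for i, letter in enumerate(text):
        prefix = text[i - 1] if i > 0 else "-"
        suffix = text[i + 1] if i + 1 < len(text) else "-"
        dist[letter].append((prefix, suffix))
    return dist


dist = triliteral_distribution(CIPHER)
for letter in sorted(dist, key=lambda ch: -len(dist[ch])):
    cells = " ".join(p + s for p, s in dist[letter])
    print(f"{letter} ({len(dist[letter]):>2}): {cells}")
```

Each printed pair corresponds to one cell of the figure: prefix on the left, suffix on the right, with the total frequency in parentheses.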
f. The triliteral frequency distribution is now to be examined with a view to ascertaining
what digraphs and trigraphs occur two or more times in the cryptogram. Consider the pair
of columns containing the prefixes and suffixes to Dc in the distribution, as shown in Fig. 14.
This pair of columns shows that the following digraphs appear in the cryptogram:

    Digraphs based on prefixes (arranged       Digraphs based on suffixes (arranged
    as one reads up the column)                as one reads up the column)

    FD, ZD, ZD, VD, AD, YD, BD,                DZ, DY, DS, DF, DZ, DZ, DV,
    ZD, ID, ZD, YD, BD, ZD, ZD,                DF, DZ, DF, DZ, DV, DF, DZ,
    ZD, CD, ZD, YD, VD, SD, GD,                DT, DZ, DO, DZ, DG, DZ, DI,
    ZD, ID                                     DF, DE

The nature of the triliteral frequency distribution is such that in finding what digraphs are
present in the cryptogram it is immaterial whether the prefixes or the suffixes to the cipher
letters are studied, so long as one is consistent in the study. For example, in the foregoing list of
digraphs based on the prefixes to Dc, the digraphs FD, ZD, ZD, VD, etc., are found; if now, the
student will refer to the suffixes of Fc, Zc, Vc, etc., he will find the very same digraphs indicated.
This being the case, the question may be raised as to what value there is in listing both the
prefixes and the suffixes to the cipher letters. The answer is that by so doing the trigraphs are
indicated at the same time. For example, in the case of Dc, the following trigraphs are indicated:

    FDZ, ZDY, ZDS, VDF, ADZ, YDZ, BDV, ZDF, IDZ, ZDF, YDZ, BDV, ZDF,
    ZDZ, ZDT, CDZ, ZDO, YDZ, VDG, SDZ, GDI, ZDF, IDE.


g. The repeated digraphs and trigraphs can now be found quite readily. Thus, in the case
of Dc, examining the list of digraphs based on suffixes, the following repetitions are noted:

    DZ appears 9 times
    DF appears 5 times
    DV appears 2 times

Examining the trigraphs with Dc as central letter, the following repetitions are noted:

    ZDF appears 4 times
    YDZ appears 3 times
    BDV appears 2 times


h. It is unnecessary, of course, to go through the detailed procedure set forth in the
preceding subparagraphs in order to find all the repeated digraphs and trigraphs. The repeated
trigraphs with Dc as central letter can be found merely from an inspection of the prefixes and
suffixes opposite Dc in the distribution. It is necessary only to find those cases in which two or
more prefixes are identical at the same time that the suffixes are identical. For example, the
distribution shows at once that in four cases the prefix to Dc is Zc at the same time that the
suffix to this letter is Fc. Hence, the trigraph ZDF appears four times. The repeated trigraphs
may all be found in this manner.
i. The most frequently repeated digraphs and trigraphs are then assembled in what is 
termed a condensed table of repetitions, so as to bring this information prominently before the eye. 
As a rule, digraphs which occur less than four or five times, and trigraphs which occur less than 
three or four times may be omitted from the condensed table as being relatively of no importance 
in the study of repetitions. In the condensed table the frequencies of the individual letters 
forming the most important digraphs, trigraphs, etc., should be indicated. 
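A condensed table of repetitions can be sketched as follows. The thresholds (digraphs occurring at least four times, trigraphs at least three times) are one reading of the rule of thumb above; the names `ngram_counts` and `condensed_table` are hypothetical.

```python
from collections import Counter

# The cryptogram of the work sheet as one continuous text.
CIPHER = (
    "SFDZFIOGHLPZFGZDYSPFHBZDS"
    "GVHTFUPLVDFGYVJVFVHTGADZZ"
    "AITYDZYFZJZTGPTVTZBDVFHTZ"
    "DFXSBGIDZYVTXOIYVTEFVMGZZ"
    "THLLVXZDFMHTZAITYDZYBDVFH"
    "TZDFKZDZZJSXISGZYGAVFSLGZ"
    "DTHHTCDZRSVTYZDOZFFHTZAIT"
    "YDZYGAVDGZZTKHITYZYSDZGHU"
    "ZFZTGUPGDIXWGHXASRUZDFUID"
    "EGHTVEAGXX"
)


def ngram_counts(text, n):
    """Count every overlapping group of n letters in the text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def condensed_table(text, min_digraph=4, min_trigraph=3):
    """Keep only the digraphs and trigraphs repeated often enough to matter."""
    digraphs = {g: c for g, c in ngram_counts(text, 2).items() if c >= min_digraph}
    trigraphs = {g: c for g, c in ngram_counts(text, 3).items() if c >= min_trigraph}
    return digraphs, trigraphs


digraphs, trigraphs = condensed_table(CIPHER)
unigrams = ngram_counts(CIPHER, 1)
for group, count in sorted({**digraphs, **trigraphs}.items(), key=lambda kv: -kv[1]):
    # Show the individual letter frequencies beside each repetition, as the text advises.
    parts = " ".join(f"{ch}({unigrams[ch]})" for ch in group)
    print(f"{group:<4}{count:>3}   {parts}")
```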
28. Classifying the cipher letters into vowels and consonants.-a. Before proceeding to a
detailed analysis of the repeated digraphs and trigraphs, a very important step can be taken which
will be of assistance not only in the analysis of the repetitions but also in the final solution of
the cryptogram. This step concerns the classification of the high-frequency letters into two
groups-vowels and consonants. For if the cryptanalyst can quickly ascertain the equivalents
of the four vowels, A, E, I, and O, and of only the four consonants, N, R, S, and T, he will then
have the values of approximately two-thirds of all the cipher letters that occur in the cryptogram;
the values of the remaining letters can almost be filled in automatically.
b. The basis for the classification will be found to rest upon a comparatively simple phe- 
nomenon: the associational or combinatory behavior of vowels is, in general, quite different 
from that of consonants. If an examination be made of Table 7-B in Appendix 1, showing the 
relative order of frequency of the 18 digraphs composing 25 percent of English telegraphic text, 
it will be seen that the letter E enters into the composition of 9 of the 18 digraphs; that is, in 
exactly half of all the cases the letter E is one of the two letters forming the digraph. The 
digraphs containing E are as follows: 


    ED   EN   ER   ES   NE   RE   SE   TE   VE

The remaining nine digraphs are as follows:

    AN   ND   OR   ST   IN   NT   TH   ON   TO
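The composition of these 18 digraphs can be checked with a short sketch. The names `DIGRAPHS`, `VOWELS`, and `kind` are introduced here for illustration; the digraphs themselves are those listed above.

```python
from collections import Counter

# The 18 digraphs listed above: first those containing E, then the remaining nine.
DIGRAPHS = [
    "ED", "EN", "ER", "ES", "NE", "RE", "SE", "TE", "VE",
    "AN", "ND", "OR", "ST", "IN", "NT", "TH", "ON", "TO",
]
VOWELS = set("AEIOU")


def kind(digraph):
    """Describe a digraph as a pattern of V (vowel) and C (consonant), e.g. 'VC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in digraph)


with_e = [d for d in DIGRAPHS if "E" in d]
kinds = Counter(kind(d) for d in DIGRAPHS)

print(len(with_e), "of", len(DIGRAPHS), "digraphs contain E")
for pattern in ("CC", "CV", "VC", "VV"):
    print(pattern, kinds[pattern])
```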


c. None of the 18 digraphs is a combination of vowels. Note now that of the 9 combinations
with E, 7 are with the consonants N, R, S, and T, one is with D, one is with V, and none is with any
vowel. In other words, Ep combines most readily with consonants but not with other vowels, or
even with itself. Using the terms often employed in the chemical analogy, E shows a great
"affinity" for the consonants N, R, S, T, but not for the vowels. Therefore, if the letters of highest
frequency occurring in a given cryptogram are listed, together with the number of times each of
them combines with the cipher equivalent of Ep, those which show considerable combining power
or affinity for the cipher equivalent of Ep may be assumed to be the cipher equivalents of N, R, S,
Tp; those which do not show any affinity for the cipher equivalent of Ep may be assumed to be the
cipher equivalents of A, I, O, Up. Applying these principles to the problem in hand, and
examining the triliteral frequency distribution, it is quite certain that Zc=Ep, not only because Zc is the
letter of highest frequency, but also because it combines with several other high-frequency letters,
such as Dc, Fc, Gc, etc. The nine letters of next highest frequency are:

    23   22   19   19   16   15   14   10   10
    D    T    F    G    V    H    Y    S    I
Let the combinations these letters form with Zc be indicated in the following manner:

    Number of times Zc occurs as prefix ___    9      4      4      1      0      0      6      0      0
    Cipher letter _________________________  D(23)  T(22)  F(19)  G(19)  V(16)  H(15)  Y(14)  S(10)  I(10)
    Number of times Zc occurs as suffix ___    9      5      2      5      0      0      2      0      0

d. Consider Dc. It occurs 23 times in the message and 18 of those times it is combined with
Zc, 9 times in the form ZcDc (=Eθp), and 9 times in the form DcZc (=θEp). It is clear that Dc
must be a consonant. In the same way, consider Tc, which shows 9 combinations with Zc, 4 in the
form ZcTc (=Eθp) and 5 in the form TcZc (=θEp). The letter Tc appears to represent a consonant,
as do also the letters Fc, Gc, and Yc. On the other hand, consider Vc, occurring in all 16 times but
never in combination with Zc; it appears to represent a vowel, as do also the letters Hc, Sc, and Ic.
So far, then, the following classification would seem logical:

    Vowels                          Consonants
    Zc (=Ep), Vc, Hc, Sc, Ic        Dc, Tc, Fc, Gc, Yc
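The affinity test can be expressed as a small routine. This is a sketch under stated assumptions: the cut-off `min_combinations=3` is an arbitrary illustrative threshold, not a figure given in the text, and the names `affinity` and `classify` are hypothetical.

```python
from collections import Counter

# The cryptogram of the work sheet as one continuous text.
CIPHER = (
    "SFDZFIOGHLPZFGZDYSPFHBZDS"
    "GVHTFUPLVDFGYVJVFVHTGADZZ"
    "AITYDZYFZJZTGPTVTZBDVFHTZ"
    "DFXSBGIDZYVTXOIYVTEFVMGZZ"
    "THLLVXZDFMHTZAITYDZYBDVFH"
    "TZDFKZDZZJSXISGZYGAVFSLGZ"
    "DTHHTCDZRSVTYZDOZFFHTZAIT"
    "YDZYGAVDGZZTKHITYZYSDZGHU"
    "ZFZTGUPGDIXWGHXASRUZDFUID"
    "EGHTVEAGXX"
)

digraphs = Counter(CIPHER[i:i + 2] for i in range(len(CIPHER) - 1))


def affinity(letter, e_equiv="Z"):
    """Return (times e_equiv precedes the letter, times e_equiv follows the letter)."""
    return digraphs[e_equiv + letter], digraphs[letter + e_equiv]


def classify(letters, e_equiv="Z", min_combinations=3):
    """Split letters into probable consonants and probable vowels by affinity for e_equiv."""
    consonants, vowels = [], []
    for letter in letters:
        total = sum(affinity(letter, e_equiv))
        (consonants if total >= min_combinations else vowels).append(letter)
    return consonants, vowels


# The nine letters of next highest frequency after Z.
consonants, vowels = classify("DTFGVHYSI")
print("Consonants:", consonants)
print("Vowels:    ", vowels)
```

In a less obliging cryptogram the counts will rarely divide so cleanly, and the threshold would need to be judged against the length of the message.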


29. Further analysis of the letters representing vowels and consonants.-a. Op is usually
the vowel of second highest frequency. Is it possible to determine which of the letters V, H, S, Ic
is the cipher equivalent of Op? Let reference be made again to Table 6 in Appendix 1, where it
is seen that the 10 most frequently occurring diphthongs are:

    Diphthong ________ IO   OU   EA   EI   AI   IE   AU   EO   AY   UE
    Frequency ________ 41   37   35   27   17   13   13   12   12   11

If V, H, S, Ic are really the cipher equivalents of A, I, O, Up (not respectively), perhaps it is possible
to determine which is which by examining the combinations they make among themselves and with
Zc (=Ep). Let the combinations of V, H, S, I, and Z that occur in the message be listed. There
are only the following:

    ZZc-4    VH-2    HH-1
    HI-1     SV-1    IS-1


ZZc is of course EEp. Note the doublet HHc; if Hc is a vowel, then the chances are excellent that
Hc=Op because the doublets AAp, IIp, UUp, are practically non-existent, whereas the double vowel
combination OOp is of next highest frequency to the double vowel combination EEp. If Hc=Op,
then Vc must be Ip because the digraph VHc occurring two times in the message could hardly be
AOp or UOp, whereas the diphthong IOp is the one of high frequency in English. So far then, the
tentative (because so far unverified) results of the analysis are as follows:


    Zc=Ep    Hc=Op    Vc=Ip

This leaves only two letters, Ic and Sc (already classified as vowels) to be separated into Ap and
Up. Note the digraphs:

    HIc=Oθp    SVc=θIp    ISc=θθp

Only two alternatives are open:
    (1) Either Ic=Ap and Sc=Up,
    (2) Or Ic=Up and Sc=Ap.

If the first alternative is selected, then

    HIc=OAp    SVc=UIp    ISc=AUp

If the second alternative is selected, then

    HIc=OUp    SVc=AIp    ISc=UAp


The eye finds it difficult to choose between these alternatives; but suppose the frequency values of
the plain-text diphthongs as given in Table 6 of Appendix 1 are added for each of these alternatives,
giving the following:

    HIc=OAp, frequency value= 7
    SVc=UIp, frequency value= 5
    ISc=AUp, frequency value=13
    Total ___________________ 25

    HIc=OUp, frequency value=37
    SVc=AIp, frequency value=17
    ISc=UAp, frequency value= 5
    Total ___________________ 59

Mathematically, the second alternative is more than twice as probable as the first. Let it be
assumed to be correct and the following (still tentative) values are now at hand:

    Zc=Ep    Hc=Op    Vc=Ip    Sc=Ap    Ic=Up
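The comparison of the two alternatives amounts to summing diphthong frequencies, which can be sketched as below. Only the six frequency values quoted in the text are used; the names `DIPHTHONG_FREQ`, `KNOWN`, and `score` are assumptions made for illustration.

```python
# Frequency values of the plain-text diphthongs quoted in the text (Table 6, Appendix 1).
DIPHTHONG_FREQ = {"OA": 7, "UI": 5, "AU": 13, "OU": 37, "AI": 17, "UA": 5}

# Tentative values already established, and the cipher digraphs still to be explained.
KNOWN = {"H": "O", "V": "I"}
CIPHER_DIGRAPHS = ["HI", "SV", "IS"]


def score(assignment):
    """Sum the diphthong frequencies produced by a trial assignment of Ic and Sc."""
    mapping = {**KNOWN, **assignment}
    total = 0
    for digraph in CIPHER_DIGRAPHS:
        plain = "".join(mapping[ch] for ch in digraph)
        total += DIPHTHONG_FREQ.get(plain, 0)
    return total


first = score({"I": "A", "S": "U"})   # alternative (1)
second = score({"I": "U", "S": "A"})  # alternative (2)
print(first, second, round(second / first, 2))
```

The ratio of the two totals is what the text means by the second alternative being more than twice as probable; it is a rough weighting, not a true probability.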


b. Attention is now directed to the letters classified as consonants. How far is it possible
to ascertain their values? The letter Dc, from considerations of frequency alone, would seem
to be Tp, but its frequency, 23, is not considerably greater than that for Tc. It is not much
greater than that for Fc or Gc, with a frequency of 19 each. But perhaps it is possible to
ascertain not the value of one letter alone but of two letters at one stroke. To do this one may make
use of a tetragraph of considerable importance in English, viz, TIONp. For if the analysis
pertaining to the vowels is correct, and if VHc=IOp, then an examination of the letters immediately
before and after the digraph VHc in the cipher text might disclose both Tp and Np. Reference
to the text gives the following:

    GVHTc        FVHTc
    θIOθp        θIOθp

The letter Tc follows VHc in both cases and very probably indicates that Tc=Np; but as to whether
Gc or Fc equals Tp cannot be decided. However, two conclusions are clear: first, the letter Dc
is neither Tp nor Np, from which it follows that it must be either Rp or Sp; second, the letters
Gc and Fc must be either Tp and Sp, respectively, or Sp and Tp, respectively, because the only
tetragraphs usually found (in English) containing the diphthong IOp as central letters are SIONp
and TIONp. This in turn means that as regards Dc, the choice between Rp and Sp is no longer
open: since Sp is accounted for by either Gc or Fc, Dc must
be Rp, a conclusion which is corroborated by the fact that ZDc (=ERp) and DZc (=REp) occur 9
times each. Thus far, then, the identifications, when inserted in an enciphering alphabet, are
as follows:

    Plain ______ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Cipher _____ S       Z       V         T H     D G F I
                                                     F G
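The search for the letters flanking VHc can be sketched as follows. The function name `contexts` is hypothetical; it simply reports the letter immediately before and after each occurrence of a digraph in the continuous cipher text.

```python
# The cryptogram of the work sheet as one continuous text.
CIPHER = (
    "SFDZFIOGHLPZFGZDYSPFHBZDS"
    "GVHTFUPLVDFGYVJVFVHTGADZZ"
    "AITYDZYFZJZTGPTVTZBDVFHTZ"
    "DFXSBGIDZYVTXOIYVTEFVMGZZ"
    "THLLVXZDFMHTZAITYDZYBDVFH"
    "TZDFKZDZZJSXISGZYGAVFSLGZ"
    "DTHHTCDZRSVTYZDOZFFHTZAIT"
    "YDZYGAVDGZZTKHITYZYSDZGHU"
    "ZFZTGUPGDIXWGHXASRUZDFUID"
    "EGHTVEAGXX"
)


def contexts(text, digraph):
    """Return (letter before, letter after) for each occurrence that has both neighbors."""
    found = []
    start = text.find(digraph, 1)  # begin at 1 so that a preceding letter exists
    while start != -1 and start + len(digraph) < len(text):
        found.append((text[start - 1], text[start + len(digraph)]))
        start = text.find(digraph, start + 1)
    return found


for before, after in contexts(CIPHER, "VH"):
    print(f"{before}VH{after}")
```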


30. Substituting deduced values in the cryptogram.-a. Thus far the analysis has been
almost purely hypothetical, for as yet not a single one of the values deduced from the foregoing
analysis has been tried out in the cryptogram. It is high time that this be done, because the
final test of the validity of the hypotheses, assumptions, and identifications made in any
cryptographic study is, after all, only this: do these hypotheses, assumptions, and identifications
ultimately yield verifiable, intelligible plain-text when consistently applied to the cipher text?
b. At the present stage in the process, since there are at hand the assumed values of but 9
out of the 25 letters that appear, it is obvious that a continuous "reading" of the cryptogram
can certainly not be expected from a mere insertion of the values of the 9 letters. However, the
substitution of these values should do two things. First, it should immediately disclose the
fragments, outlines, or "skeletons" of "good" words in the text; and second, it should disclose
no places in the text where "impossible" sequences of letters are established. By the first is
meant that the partially deciphered text should show the outlines or skeletons of words such
as may be expected to be found in the communication; this will become quite clear in the next
subparagraph. By the second is meant that sequences, such as "AOOEN" or "TNRSENO" or the
like, obviously not possible or extremely unusual in normal English text, must not result from
the substitution of the tentative identifications resulting from the analysis. The appearance
of several such extremely unusual or impossible sequences at once signifies that one or more of
the assumed values is incorrect.
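Trial substitution of tentative values is easily mechanized. The sketch below uses the nine values reached in Par. 29 and tries both readings of Gc and Fc, since these are still undecided between Tp and Sp. The names `TENTATIVE`, `substitute`, `alt_1`, and `alt_2` are hypothetical, and a period stands for a letter not yet identified.

```python
# Cipher text of the work sheet, 25 letters to the line.
ROWS = [
    "SFDZFIOGHLPZFGZDYSPFHBZDS",
    "GVHTFUPLVDFGYVJVFVHTGADZZ",
    "AITYDZYFZJZTGPTVTZBDVFHTZ",
    "DFXSBGIDZYVTXOIYVTEFVMGZZ",
    "THLLVXZDFMHTZAITYDZYBDVFH",
    "TZDFKZDZZJSXISGZYGAVFSLGZ",
    "DTHHTCDZRSVTYZDOZFFHTZAIT",
    "YDZYGAVDGZZTKHITYZYSDZGHU",
    "ZFZTGUPGDIXWGHXASRUZDFUID",
    "EGHTVEAGXX",
]

# Tentative cipher-to-plain values deduced so far (G and F handled separately below).
TENTATIVE = {"Z": "E", "H": "O", "V": "I", "S": "A", "I": "U", "T": "N", "D": "R"}


def substitute(text, values, unknown="."):
    """Replace each cipher letter by its assumed plain value, or by a placeholder."""
    return "".join(values.get(ch, unknown) for ch in text)


# G and F are either S and T, or T and S, respectively; both readings are shown.
alt_1 = {**TENTATIVE, "G": "S", "F": "T"}
alt_2 = {**TENTATIVE, "G": "T", "F": "S"}

for row in ROWS:
    print(row)
    print(substitute(row, alt_1))
    print(substitute(row, alt_2))
    print()
```

Reading the two plain-text lines under each cipher line side by side shows which reading yields skeletons of good words and which yields improbable sequences.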




c. Here are the results of substituting the nine values which have been deduced by the 
reasoning based on a classification of the high-frequency letters into vowels and consonants 
and the study of the members of the two groups: 


      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

     10 19 23 35 19 10  3 19 15  5  5 35 19 19 35 23 14 10  5 19 15  4 35 23 10
A     S  F  D  Z  F  I  O  G  H  L  P  Z  F  G  Z  D  Y  S  P  F  H  B  Z  D  S
      A  T  R  E  T        S  O        E  T  S  E  R     A     T  O     E  R  A
         S        S        T              S  T                 S

     19 16 15 22 19  5  5  5 16 23 19 19 14 16  3 16 19 16 15 22 19  8 23 35 35
B     G  V  H  T  F  U  P  L  V  D  F  G  Y  V  J  V  F  V  H  T  G  A  D  Z  Z
      S  I  O  N  T           I  R  T  S     I     I  T  I  O  N  S     R  E  E
      T           S                 S  T              S           T

      8 10 22 14 23 35 14 19 35  3 35 22 19  5 22 16 22 35  4 23 16 19 15 22 35
C     A  I  T  Y  D  Z  Y  F  Z  J  Z  T  G  P  T  V  T  Z  B  D  V  F  H  T  Z
            N     R  E     T  E     E  N  S     N  I  N  E     R  I  T  O  N  E
                           S              T                          S

     23 19  8 10  4 19 10 23 35 14 16 22  8  3 10 14 16 22  3 19 16  2 19 35 35
D     D  F  X  S  B  G  I  D  Z  Y  V  T  X  O  I  Y  V  T  E  F  V  M  G  Z  Z
      R  T     A     S     R  E     I  N              I  N     T  I     S  E  E
         S           T                                         S        T

     22 15  5  5 16  8 35 23 19  2 15 22 35  8 10 22 14 23 35 14  4 23 16 19 15
E     T  H  L  L  V  X  Z  D  F  M  H  T  Z  A  I  T  Y  D  Z  Y  B  D  V  F  H
      N  O        I     E  R  T     O  N  E        N     R  E        R  I  T  O
                              S                                            S

     22 35 23 19  2 35 23 35 35  3 10  8 10 10 19 35 14 19  8 16 19 10  5 19 35
F     T  Z  D  F  K  Z  D  Z  Z  J  S  X  I  S  G  Z  Y  G  A  V  F  S  L  G  Z
      N  E  R  T     E  R  E  E     A        A  S  E     S     I  T  A     S  E
               S                                T        T        S        T

     23 22 15 15 22  1 23 35  2 10 16 22 14 35 23  3 35 19 19 15 22 35  8 10 22
G     D  T  H  H  T  C  D  Z  R  S  V  T  Y  Z  D  O  Z  F  F  H  T  Z  A  I  T
      R  N  O  O  N     R  E     A  I  N     E  R     E  T  T  O  N  E        N
                                                         S  S

     14 23 35 14 19  8 16 23 19 35 35 22  2 15 10 22 14 35 14 10 23 35 19 15  5
H     Y  D  Z  Y  G  A  V  D  G  Z  Z  T  K  H  I  T  Y  Z  Y  S  D  Z  G  H  U
         R  E     S     I  R  S  E  E  N     O     N     E     A  R  E  S  O
                  T           T                                         T

     35 19 35 22 19  5  5 19 23 10  8  1 19 15  8  8 10  2  5 35 23 19  5 10 23
J     Z  F  Z  T  G  U  P  G  D  I  X  W  G  H  X  A  S  R  U  Z  D  F  U  I  D
      E  T  E  N  S        S  R           S  O        A        E  R  T        R
         S        T        T              T                          S

      3 19 15 22 16  3  8 19  8  8
K     E  G  H  T  V  E  A  G  X  X
         S  O  N  I        S
         T                 T
d. No impossible sequences are brought to light, and, moreover, several long words, nearly
complete, stand out in the text. Note the following portions:

         A21
         H  B  Z  D  S  G  V  H  T  F
    (1)  O  ?  E  R  A  S  I  O  N  T
                        T           S

         C15
         T  V  T  Z  B  D  V  F  H  T  Z  D  F
    (2)  N  I  N  E  ?  R  I  T  O  N  E  R  T
                              S              S

         F22
         S  L  G  Z  D  T  H  H  T
    (3)  A  ?  S  E  R  N  O  O  N
               T

The words are obviously OPERATIONS, NINE PRISONERS, and AFTERNOON. The value of Gc is
clearly Tp; that of Fc is Sp; and the following additional values are certain:

    Bc=Pp    Lc=Fp


31. Completing the solution.-a. Each time an additional value is obtained, substitution 
is at once made throughout the cryptogram. This leads to the determination of further values, 
in an ever-widening circle, until all the identifications are firmly and finally established, and the 
message is completely solved. In this case the decipherment is as follows: 


      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

A     S  F  D  Z  F  I  O  G  H  L  P  Z  F  G  Z  D  Y  S  P  F  H  B  Z  D  S
      A  S  R  E  S  U  L  T  O  F  Y  E  S  T  E  R  D  A  Y  S  O  P  E  R  A

B     G  V  H  T  F  U  P  L  V  D  F  G  Y  V  J  V  F  V  H  T  G  A  D  Z  Z
      T  I  O  N  S  B  Y  F  I  R  S  T  D  I  V  I  S  I  O  N  T  H  R  E  E

C     A  I  T  Y  D  Z  Y  F  Z  J  Z  T  G  P  T  V  T  Z  B  D  V  F  H  T  Z
      H  U  N  D  R  E  D  S  E  V  E  N  T  Y  N  I  N  E  P  R  I  S  O  N  E

D     D  F  X  S  B  G  I  D  Z  Y  V  T  X  O  I  Y  V  T  E  F  V  M  G  Z  Z
      R  S  C  A  P  T  U  R  E  D  I  N  C  L  U  D  I  N  G  S  I  X  T  E  E

E     T  H  L  L  V  X  Z  D  F  M  H  T  Z  A  I  T  Y  D  Z  Y  B  D  V  F  H
      N  O  F  F  I  C  E  R  S  X  O  N  E  H  U  N  D  R  E  D  P  R  I  S  O

F     T  Z  D  F  K  Z  D  Z  Z  J  S  X  I  S  G  Z  Y  G  A  V  F  S  L  G  Z
      N  E  R  S  W  E  R  E  E  V  A  C  U  A  T  E  D  T  H  I  S  A  F  T  E

G     D  T  H  H  T  C  D  Z  R  S  V  T  Y  Z  D  O  Z  F  F  H  T  Z  A  I  T
      R  N  O  O  N  Q  R  E  M  A  I  N  D  E  R  L  E  S  S  O  N  E  H  U  N

H     Y  D  Z  Y  G  A  V  D  G  Z  Z  T  K  H  I  T  Y  Z  Y  S  D  Z  G  H  U
      D  R  E  D  T  H  I  R  T  E  E  N  W  O  U  N  D  E  D  A  R  E  T  O  B

J     Z  F  Z  T  G  U  P  G  D  I  X  W  G  H  X  A  S  R  U  Z  D  F  U  I  D
      E  S  E  N  T  B  Y  T  R  U  C  K  T  O  C  H  A  M  B  E  R  S  B  U  R

K     E  G  H  T  V  E  A  G  X  X
      G  T  O  N  I  G  H  T  X  X


Message: AS RESULT OF YESTERDAYS OPERATIONS BY FIRST DIVISION THREE
HUNDRED SEVENTY NINE PRISONERS CAPTURED INCLUDING SIXTEEN OFFICERS ONE
HUNDRED PRISONERS WERE EVACUATED THIS AFTERNOON REMAINDER LESS ONE HUNDRED
THIRTEEN WOUNDED ARE TO BE SENT BY TRUCK TO CHAMBERSBURG TONIGHT


b. The solution should, as a rule, not be considered complete until an attempt has been 
made to discover all the elements underlying the general system and the specific key to a message. 
In this case there is no need to delve further into the general system, for it is merely one of 
monoalphabetic substitution with a mixed cipher alphabet. It is necessary or advisable, how- 
ever, to reconstruct the cipher alphabet because this may give clues that later may become 
valuable. 
c. Cipher alphabets should, as a rule, be reconstructed by the cryptanalyst in the form of
enciphering alphabets because they will then usually be in the form in which the encipherer
used them. This is important for two reasons. First, if the sequence in the cipher component
gives evidence of system in its construction or if it yields clues pointing toward its derivation
from a keyword or a key-phrase, this may often corroborate the identifications already made
and may lead directly to additional identifications. A word or two of explanation is advisable
here. For example, refer to the skeletonized enciphering alphabet given at the end of par. 29b:

    Plain ______ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Cipher _____ S       Z       V         T H     D G F I
                                                     F G


Suppose the cryptanalyst, looking at the sequence DGFI or DFGI in the cipher component,
suspects the presence of a keyword-mixed alphabet. Then DFGI is certainly a more plausible
sequence than DGFI. Again, noting the sequence S . . . Z . . . V . . . . TH . . D, he might
have an idea that the keyword begins after the Z and that the TH is followed by AB or BC. This
would mean that either P, Qp=A, Bc or B, Cc. Assuming that P, Qp=A, Bc, he refers to the
frequency distribution and finds that the assumptions Pp=Ac and Qp=Bc are not good; on the other
hand, assuming that P, Qp=B, Cc, the frequency distribution gives excellent corroboration.
A trial of these values would materially hasten solution because it is often the case in
cryptanalysis that if the value of a very low-frequency letter can be surely established it will yield
clues to other values very quickly. Thus, if Qp is definitely identified it almost invariably will
identify Up, and will give clues to the letter following the Up, since it must be a vowel. In the
case under discussion the identification PQp=BCc would have turned out to be correct. For the
foregoing reason an attempt should always be made in the early stages of the analysis to
determine, if possible, the basis of construction or derivation of the cipher alphabet; as a rule this
can be done only by means of the enciphering alphabet, and not the deciphering alphabet. For
example, the skeletonized deciphering alphabet corresponding to the enciphering alphabet
directly above is as follows:

    Cipher _____ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Plain ______       R   T S O U                   A N   I       E
                           S T


Here no evidences of a keyword-mixed alphabet are seen at all. However, if the enciphering
alphabet has been examined and shows no evidences of systematic construction, the deciphering
alphabet should then be examined with this in view, because occasionally it is the deciphering
alphabet which shows the presence of a key or keying element, or which has been systematically
derived from a word or phrase. The second reason why it is important to try to discover the basis
of construction or derivation of the cipher alphabet is that it affords clues to the general type of
keywords or keying elements employed by the enemy. This is a psychological factor, of course,
and may be of assistance in subsequent studies of his traffic. It merely gives a clue to the general
type of thinking indulged in by certain of his cryptographers.
d. In the case of the foregoing solution, the complete enciphering alphabet is found to be as
follows:

    Plain ______ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Cipher _____ S U X Y Z L E A V N W O R T H B C D F G I J K M P

Obviously, the letter Q, which is the only letter not appearing in the cryptogram, should follow
P in the cipher component. Note now that the latter is based upon the keyword LEAVENWORTH,
and that this particular cipher alphabet has been composed by shifting the mixed sequence based
upon this keyword five intervals to the right so that the key for the message is Ap=Sc. Note
also that the deciphering alphabet fails to give any evidence of keyword construction based upon
the word LEAVENWORTH.

    Cipher _____ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Plain ______ H P Q R G S T O U V W F X J L Y Z M A N B I K C D E
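The construction of this cipher alphabet from its keyword can be reproduced with a short sketch. The function names `keyword_mixed_sequence`, `enciphering_alphabet`, and `decipher` are assumptions made here; the method shown is the ordinary keyword-mixed sequence (keyword with repeated letters dropped, followed by the unused letters in normal order), slid so that a chosen cipher letter stands under plain A.

```python
import string


def keyword_mixed_sequence(keyword):
    """Keyword with repeated letters dropped, followed by the remaining letters in order."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch not in seen:
            seen.append(ch)
    return "".join(seen)


def enciphering_alphabet(keyword, cipher_for_a):
    """Slide the mixed sequence so that cipher_for_a stands under plain A."""
    sequence = keyword_mixed_sequence(keyword)
    start = sequence.index(cipher_for_a)
    return sequence[start:] + sequence[:start]


CIPHER_COMPONENT = enciphering_alphabet("LEAVENWORTH", "S")
DECIPHER = {c: p for p, c in zip(string.ascii_uppercase, CIPHER_COMPONENT)}


def decipher(text):
    """Convert cipher text to plain text with the reconstructed alphabet."""
    return "".join(DECIPHER[ch] for ch in text)


print(string.ascii_uppercase)
print(CIPHER_COMPONENT)
print(decipher("SFDZFIOGHLPZFGZDYSPFHBZDS"))
```

Listing `DECIPHER` in the order of the cipher letters A to Z gives the deciphering alphabet, in which the keyword is no longer visible.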


e. If neither the enciphering nor the deciphering alphabet exhibits characteristics which
give indication of derivation from a keyword by some form of mixing or disarrangement, the
latter is nevertheless not finally excluded as a possibility. The student is referred to Section IX
of Elementary Military Cryptography, wherein will be found methods for deriving mixed alphabets
by transposition methods applied to keyword-mixed alphabets. For the reconstruction of such
mixed alphabets the cryptanalyst must use ingenuity and a knowledge of the more common
methods of suppressing the appearance of keywords in the mixed alphabets.
32. General notes on the foregoing solution.-a. The example solved above is admittedly 
a more or less artificial illustration of the steps in analysis, made so in order to demonstrate 
general principles. It was easy to solve because the frequencies of the various cipher letters cor- 
responded quite well with the normal or expected frequencies. However, all cryptograms of 
the same monoalphabetical nature can be solved along the same general lines, after more or less 
experimentation, depending upon the length of the cryptogram, the skill, and the experience of 
the cryptanalyst. 
b. It is no cause for discouragement if the student's initial attempts to solve a cryptogram of 
this type require much more time and effort than were apparently required in solving the fore- 
going purely illustrative example. It is indeed rarely the case that every assumption made by the 
cryptanalyst proves in the end to have been correct; more often is it the case that a good many 
of his initial assumptions are incorrect, and that he loses much time in casting out the erroneous 
ones. The speed and facility with which this elimination process is conducted is in many cases 
all that distinguishes the expert from the novice. 
c. Nor will the student always find that the initial classification into vowels and consonants 
can be accomplished as easily and quickly as was apparently the case in the illustrative example. 
The principles indicated are very general in their nature and applicability, and there are, in 
addition, some other principles that may be brought to bear in case of difficulty. Of these, per- 
haps the most useful are the following: 
(1) In normal English it is unusual to find two or three consonants in succession, each of high
frequency. If in a cryptogram three or four high-frequency letters appear in
succession, it is practically certain that at least one of these represents a vowel.3


3 Sequences of seven consonants are not impossible, however, as in STRENGTH THROUGH.
(2) Successions of three vowels are rather unusual in English.4 Practically the only time
this happens is when a word ends in two vowels and the next word begins with a vowel.5
(3) When two letters already classified as vowel-equivalents are separated by a sequence of
six or more letters, it is either the case that one of the supposed vowel-equivalents is incorrect,
or else that one or more of the intermediate letters is a vowel-equivalent.6


(4) Reference to Table 7-B of Appendix 1 discloses the following:

Distribution of first 18 digraphs forming 25 percent of English text

Number of consonant-consonant digraphs ...................  4
Number of consonant-vowel digraphs .......................  6
Number of vowel-consonant digraphs .......................  8
Number of vowel-vowel digraphs ...........................  0

Distribution of first 53 digraphs forming 50 percent of English text

Number of consonant-consonant digraphs ...................  8
Number of consonant-vowel digraphs ....................... 23
Number of vowel-consonant digraphs ....................... 18
Number of vowel-vowel digraphs ...........................  4


The latter tabulation shows that of the first 53 digraphs which form 50 percent of English text,
41 of them, that is, over 75 percent, are combinations of a vowel with a consonant. In short,
in normal English the vowels and the high-frequency consonants are in the long run
distributed fairly evenly and regularly throughout the text.


(5) As a rule, repetitions of trigraphs in the cipher text are composed of high-frequency
letters forming high-frequency combinations. The latter practically always contain at least one
vowel; in fact, if reference is made to Table 10-A of Appendix 1, it will be noted that 36 of the 56
trigraphs having a frequency of 100 or more contain one vowel, 17 of them contain two vowels,
and only three of them contain no vowel. In the case of tetragraph repetitions, Table 11-A of
Appendix 1 shows that no tetragraph listed therein fails to contain at least one vowel; 28 of them
contain one vowel, 25 contain two vowels, and 2 contain three vowels.
(6) Quite frequently when two known vowel-equivalents are separated by six or more letters
none of which seems to be of sufficiently high frequency to represent one of the vowels A E I O,
the chances are good that the cipher-equivalent of the vowel U or Y is present.
(7) The letter Q is invariably followed by U; the letters J and V are invariably followed by a
vowel.
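
Principles (3) and (6) both turn on locating long stretches of cipher text that contain no letter yet classified as a vowel-equivalent. The following is a minimal sketch of that search, not a method given in this text; the function name long_gaps and the sample cipher text are hypothetical, and the sketch also reports runs at the very beginning or end of the text, which are not strictly "between" two vowel-equivalents.

```python
def long_gaps(cipher, known_vowels, min_len=6):
    """Return (start_index, run) for every maximal run of at least
    min_len letters containing no known vowel-equivalent."""
    gaps = []
    run_start = 0
    # Walk one position past the end so that a trailing run is closed too.
    for i in range(len(cipher) + 1):
        at_end = i == len(cipher)
        if at_end or cipher[i] in known_vowels:
            if i - run_start >= min_len:
                gaps.append((run_start, cipher[run_start:i]))
            run_start = i + 1
    return gaps


if __name__ == "__main__":
    # Hypothetical cipher text; X and A are assumed to be vowel-equivalents.
    for start, run in long_gaps("XQZBTRKMQXAZ", {"X", "A"}):
        print(start, run)
```

Any letter that recurs in several of the reported runs is a candidate for the remaining vowels, U or Y in particular.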
d. In the foregoing example the amount of experimentation or "cutting and fitting" was
practically nil. (This is not true of real cases as a rule.) Where such experimentation is neces-
sary, the underscoring of all repetitions of several letters is very essential, as it calls attention to
peculiarities of structure that often yield clues.

4 Note that the word RADIOED, past tense of the verb RADIO, is coming into usage.
5 A sequence of seven vowels is not impossible, however, as in THE WAY YOU EARN.
6 Some cryptanalysts place a good deal of emphasis upon this principle as a method of locating the remaining
vowels after the first two or three have been located. They recommend that the latter be underlined throughout
the text and then all sequences of five or more letters showing no underlines be studied attentively. Certain
letters which occur in several such sequences are sure to be vowels.
An arithmetical aid in the study is as follows:
Take a letter thought to be a good possibility as the cipher equivalent of a vowel (hereafter termed a possible
vowel-equivalent) and find the length of each interval from the possible vowel-equivalent to the next known (fairly
surely determined) vowel-equivalent. Multiply the interval by the number of times this interval is found. Add
the products and divide by the total number of intervals considered. This will give the mean interval for that
possible vowel-equivalent. Do the same for all the other possible vowel-equivalents. The one for which the
mean is the greatest is most probably a vowel-equivalent. Underline this letter throughout the text and repeat
the process for locating additional vowel-equivalents, if any remain to be located.
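
The arithmetical aid described in the footnote on mean intervals can be sketched as follows. This is one reading of the footnote, offered under stated assumptions: the "interval" is taken to be the number of positions from an occurrence of the candidate letter forward to the next known vowel-equivalent, occurrences with no later vowel-equivalent are ignored, and the names mean_interval and rank_candidates are hypothetical.

```python
def mean_interval(cipher, candidate, known_vowels):
    """Mean distance from each occurrence of candidate to the next
    known vowel-equivalent; None if no interval can be measured."""
    intervals = []
    for i, ch in enumerate(cipher):
        if ch != candidate:
            continue
        for j in range(i + 1, len(cipher)):
            if cipher[j] in known_vowels:
                intervals.append(j - i)
                break
    if not intervals:
        return None
    # Summing (interval x times found) and dividing by the number of
    # intervals is the same as taking the plain arithmetic mean.
    return sum(intervals) / len(intervals)


def rank_candidates(cipher, candidates, known_vowels):
    """Candidates sorted by mean interval, greatest first."""
    scored = []
    for c in candidates:
        m = mean_interval(cipher, c, known_vowels)
        if m is not None:
            scored.append((m, c))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Hypothetical text in which A is assumed to be a known vowel-equivalent.
    print(rank_candidates("QAXBBQXXA", ["Q", "X"], {"A"}))
```

Following the footnote, the candidate at the head of the ranking would be underlined throughout the text and the computation repeated with that letter added to the known vowel-equivalents.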


e. After a few basic assumptions of values have been made, if short words or skeletons of 
words do not become manifest, it is necessary to make further assumptions for unidentified letters. 
This is accomplished most often by assuming a word.7 Now there are two places in every message
which lend themselves more readily to successful attack by the assumption of words than do
any other places: the very beginning and the very end of the message. The reason is quite
obvious, for although words may begin or end with almost any letter of the alphabet, they
usually begin and end with but a few very common digraphs and trigraphs. Very often the
association of letters in peculiar combinations will enable the student to note where one word
ends and the next begins. For example, suppose E, N, S, and T have been definitely identified,
and a sequence like the following is found in a cryptogram:

. . . E N T S N E . . .

Obviously the break between two words should fall either after the S of E N T S or after the T
of E N T, so that two possibilities are offered: . . . E N T S / N E . . ., or . . . E N T / S N E
. . . . Since in English there are very few words with the initial trigraph S N E, it is most
likely that the proper division is . . . E N T S / N E . . . . Obviously, when several word
divisions have been found, the solution is more readily achieved because of the greater ease with
which assumptions of additional new values may be made.
33. The "probable word" method; its value and applicability.-a. In practically all cryptan-
alytic studies, short-cuts can often be made by assuming the presence of certain words in the
message under study. Some writers attach so much value to this kind of an "attack from the 
rear" that they practically elevate it to the position of a method and call it the "intuitive method" 
or the "probable-word method." It is, of course, merely a refinement of what in every-day 
language is called "assuming" or "guessing" a word in the message. The value of making a 
"good guess" can hardly be overestimated, and the cryptanalyst should never feel that he is 
accomplishing a solution by an illegitimate subterfuge when he has made a fortunate guess 
leading to solution. A correct assumption as to plain text will often save hours or days of labor, 
and sometimes there is no alternative but to try to "guess a word", for occasionally a system is 
encountered the solution of which is absolutely dependent upon this artifice. 
b. The expression "good guess" is used advisedly. For it is "good" in two respects. First, 
the cryptanalyst must use care in making his assumptions as to plain-text words. In this he 
must be guided by extraneous circumstances leading to the assumption of probable words-not 
just any words that come to his mind. Therefore he must use his imagination but he must
nevertheless carefully control it by the exercise of good judgment. Second, only if the "guess" 
is correct and leads to solution, or at least puts him on the road to solution, is it a good guess. 
But, while realizing the usefulness and the time and labor-saving features of a solution by assum- 
ing a probable word, the cryptanalyst should exercise discretion in regard to how long he may 
continue in his efforts with this method. Sometimes he may actually waste time by adhering 
to the method too long, if straightforward, methodical analysis will yield results more quickly. 
c. Obviously, the "probable-word" method has much more applicability when working
upon material the general nature of which is known, than when working upon more or less
isolated communications exchanged between correspondents concerning whom or whose activities
nothing is known. For in the latter case there is little or nothing that the imagination can seize
upon as a background or basis for the assumptions.8

7 This process does not involve anything more mysterious than ordinary, logical reasoning; there is nothing
of the subnormal or supernormal about it. If cryptanalytic success seems to require processes akin to those of
medieval magic, if "hocus-pocus" is much to the fore, the student should begin to look for items that the claimant
of such success has carefully hidden from view, for the mystification of the uninitiated. (See Par. 33 in this
connection.)
d. Very frequently, the choice of probable words is aided or limited by the number and 
positions of repeated letters. These repetitions may be patent-that is, externally visible in 
the cryptographic text as it originally stands-or they may be latent-that is, externally invisible 
but susceptible of being made patent as a result of the analysis. For example, in a monoalpha-
betic substitution cipher, such as that discussed in the preceding paragraph, the repeated letters
are directly exhibited in the cryptogram; later the student will encounter many cases in which
the repetitions are latent, but are made patent by the analytical process. When the repetitions
are patent, then the pattern or formula to which the repeated letters conform is of direct use
in assuming plain-text words; and when the text is in word lengths, the pattern is obviously of
even greater assistance. Suppose the cryptanalyst is dealing with military text, in which case
he may expect such words as DIVISION, BATTALION, etc., to be present in the text. The
positions of the repeated letter I in DIVISION, of the reversible digraph AT, TA in BATTALION, 
and so on, constitute for the experienced cryptanalyst tell-tale indications of the presence of 
these words, even when the text is not divided up into its original word lengths. 


e. The important aid that a study of word patterns can afford in cryptanalysis warrants the
use of definite terminology and the establishment of certain data having a bearing thereon. The
phenomenon herein under discussion, namely, that many words are of such construction as
regards the number and positions of repeated letters as to make them readily identifiable, will be
termed idiomorphism (from the Greek "idios" = one's own, individual, peculiar + "morphe" = form).
Words which show this phenomenon will be termed idiomorphic. It will be useful to deal with
the idiomorphisms symbolically and systematically as described below.


f. When dealing with cryptograms in which the word lengths are determined or specifically
shown, it is convenient to indicate their lengths and their repeated letters in some easily recog-
nized manner or by formulas. This is exemplified, in the case of the word DIVISION, by the
formula ABCBDBEF; in the case of the word BATTALION, by the formula ABCCBDEFG. If the
cryptanalyst, during the course of his studies, makes note of striking formulas he has encoun-
tered, with the words which fit them, after some time he will have assembled a quite valuable
body of data. And, after more or less complete lists of such formulas have been established in
some systematic arrangement, a rapid comparison of the idiomorphs in a specific cryptogram
with those in his lists will be feasible and will often lead to the assumption of the correct word.
Such lists can be arranged according to word length, as shown herewith:

3/aba     DID, EVE, EYE.
  abb     ADD, ALL, ILL, OFF, etc.
4/abac    ARAB, AWAY, etc.
  abca    AREA, BOMB, DEAD, etc.
  abbc
  abcb
  etc.    etc.
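
The derivation of a formula from a word, and the arrangement of a word list by length and formula, can be sketched as below. This is only an illustration of the bookkeeping described in this subparagraph; the function names idiomorph and index_by_pattern and the short word list are hypothetical.

```python
from string import ascii_uppercase


def idiomorph(word):
    """Formula of a word: the first distinct letter becomes A, the
    second distinct letter B, and so on (DIVISION -> ABCBDBEF)."""
    mapping = {}
    out = []
    for ch in word.upper():
        if ch not in mapping:
            mapping[ch] = ascii_uppercase[len(mapping)]
        out.append(mapping[ch])
    return "".join(out)


def index_by_pattern(words):
    """Group words under the key (word length, formula)."""
    table = {}
    for w in words:
        table.setdefault((len(w), idiomorph(w)), []).append(w.upper())
    return table


if __name__ == "__main__":
    print(idiomorph("DIVISION"), idiomorph("BATTALION"))
    lists = index_by_pattern(["DID", "EVE", "EYE", "ADD", "ALL", "ARAB", "AREA"])
    for key in sorted(lists):
        print(key, lists[key])
```

A cipher word of known length is put through the same idiomorph function, since a monoalphabetic substitution leaves the formula unchanged, and the result is looked up in the table.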


8 General Givierge in his Cours de Cryptographie (p. 121) says: "However, expert cryptanalysts often
employ such details as are cited above [in connection with assuming the presence of 'probable words'], and the
experience of the years 1914 to 1918, to cite only those, prove that in practice one often has at his disposal ele-
ments of this nature, permitting assumptions much more audacious than those which served for the analysis
of the last example. The reader would therefore be wrong in imagining that such fortuitous elements are
encountered only in cryptographic works where the author deciphers a document that he himself enciphered.
Cryptographic correspondence, if it is extensive, and if sufficiently numerous working data are at hand, often
furnishes elements so complete that an author would not dare use all of them in solving a problem for fear of
being accused of obvious exaggeration."

g. When dealing with cryptographic text in which the lengths of the words are not indicated
or otherwise determinable, lists of the foregoing nature are not so useful as lists in which the
words (or parts of words) are arranged according to the intervals between identical letters in the
following manner:

1 Interval     2 Intervals    3 Intervals    Repeated digraphs
-DiD-          AbbAcy         AbeyAnce       COCOa
-EvE-          ArAbiA         hAbitAble      dERER
-EyE-          AblAtive       lAborAtory     ICICle
dIvIsIon       AboArd         AbreAst        ININg
revIsIon       -AciA-         AbroAd         bAGgAGe
etc.           etc.           etc.           etc.
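
When word lengths are unknown, the same idea can be applied to running text by noting every place where a letter recurs after a fixed number of intervening letters. The sketch below assumes that "1 Interval" in the list above means one intervening letter (as in DiD), "2 Intervals" two (as in AbbAcy), and so on; that reading, and the name repeats_at_interval, are assumptions of this example.

```python
def repeats_at_interval(text, gap):
    """Return (position, letter) wherever the same letter recurs with
    exactly `gap` letters standing between the two occurrences."""
    step = gap + 1
    return [
        (i, text[i])
        for i in range(len(text) - step)
        if text[i] == text[i + step]
    ]


if __name__ == "__main__":
    print(repeats_at_interval("DIVISION", 1))   # the I . I . I pattern
    print(repeats_at_interval("ABBACY", 2))
    print(repeats_at_interval("ABEYANCE", 3))
```

Run over a cryptogram, the positions reported for each interval can be compared with a list arranged like the one above to suggest probable words.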


34. Solution of additional cryptograms produced by the same cipher component.-a. To
return, after a rather long digression, to the cryptogram solved in pars. 28-31, once the cipher
component of a cipher alphabet has been reconstructed, subsequent messages which have been
enciphered by means of the same cipher component may be solved very readily, and without
recourse to the principles of frequency, or application of the probable-word method. It has been
seen that the illustrative cryptogram treated in paragraphs 24-31 was enciphered by juxtaposing
the cipher component against the normal sequence so that Ap=Sc. It is obvious that the cipher
component may be set against the plain component at any one of 26 different points of coinci-
dence, each yielding a different cipher alphabet. After a cipher component has been reconstructed,
however, it becomes a known sequence, and the method of converting the cipher letters into their 
plain-component equivalents and then completing the plain-component sequence begun by 
each equivalent can be applied to solve any cryptogram which has been enciphered by that 
cipher component. 
b. An example will serve to make the process clear. Suppose the following message, passing
between the same two stations as before, was intercepted shortly after the first message had
been solved:

I Y E W K   C E R N W   O F O S E   L F O O H   E A Z X X

It is assumed that the same cipher component was used, but with a different key letter. First
the initial two groups are converted into their plain-component equivalents by setting the
cipher component against the normal sequence at any arbitrary point of coincidence. The
initial letter of the former may as well be set against A of the latter, with the following result:

Plain ........... A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher .......... L E A V N W O R T H B C D F G I J K M P Q S U X Y Z

Cryptogram ...... I Y E W K   C E R N W
Equivalents ..... P Y B F R   L B H E F

The normal sequence initiated by each of these conversion equivalents is now completed, with
the results shown in Fig. 15. Note the plain-text generatrix, CLOSEYOURS, which manifests
itself without further analysis. The rest of the message may be read either by continuing the
same process, or, what is even more simple, the key letter of the message may now be determined
quite readily and the message deciphered by its means.

 I Y E W K C E R N W
 P Y B F R L B H E F
 Q Z C G S M C I F G
 R A D H T N D J G H
 S B E I U O E K H I
 T C F J V P F L I J
 U D G K W Q G M J K
 V E H L X R H N K L
 W F I M Y S I O L M
 X G J N Z T J P M N
 Y H K O A U K Q N O
 Z I L P B V L R O P
 A J M Q C W M S P Q
 B K N R D X N T Q R
*C L O S E Y O U R S
 D M P T F Z P V S T
 E N Q U G A Q W T U
 F O R V H B R X U V
 G P S W I C S Y V W
 H Q T X J D T Z W X
 I R U Y K E U A X Y
 J S V Z L F V B Y Z
 K T W A M G W C Z A
 L U X B N H X D A B
 M V Y C O I Y E B C
 N W Z D P J Z F C D
 O X A E Q K A G D E
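
The conversion and completion steps just described can be sketched in a few lines. The cipher component is the one shown in subparagraph b; the function names convert and complete are hypothetical, and picking out the plain-text generatrix is left to the eye, as in the text.

```python
from string import ascii_uppercase as NORMAL

# Reconstructed cipher component, as set against the normal sequence above.
CIPHER_COMPONENT = "LEAVNWORTHBCDFGIJKMPQSUXYZ"


def convert(cipher_text, component=CIPHER_COMPONENT):
    """Replace each cipher letter by the normal-sequence letter standing
    opposite it when the component is set with its first letter at A."""
    return "".join(NORMAL[component.index(c)] for c in cipher_text)


def complete(equivalents):
    """Complete the normal sequence begun by each equivalent: row k is
    every letter advanced k places, for k = 0 through 25."""
    return [
        "".join(NORMAL[(NORMAL.index(c) + k) % 26] for c in equivalents)
        for k in range(26)
    ]


if __name__ == "__main__":
    equivalents = convert("IYEWKCERNW")
    print(equivalents)
    for row in complete(equivalents):
        print(row)
```

One of the 26 rows reads as plain text; the others serve only to show that no second intelligible generatrix is present.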
c. In order that the student may understand without question just what is involved in the 
latter step, that is, discovering the key letter after the first two or three groups have been deci- 
phered by the conversion-completion process, the foregoing example will be used. It was noted 
that the first cipher group was finally deciphered as follows: 
Cipher .......... I Y E W K
Plain ........... C L O S E

Now set the cipher component against the normal sequence so that Cp=Ic. Thus:

Plain ........... A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher .......... F G I J K M P Q S U X Y Z L E A V N W O R T H B C D

It is seen here that when Cp=Ic then Ap=Fc. This is the key for the entire message. The
decipherment may be completed by direct reference to the foregoing cipher alphabet. Thus:

Cipher .......... I Y E W K   C E R N W   O F O S E   L F O O H   E A Z X X
Plain ........... C L O S E   Y O U R S   T A T I O   N A T T W   O P M X X

Message: CLOSE YOUR STATION AT TWO PM
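
The recovery of the key letter from one known plain-cipher pair, and the decipherment that follows from it, may be sketched as below. The cipher component is the one reconstructed earlier; the names key_letter and decipher are hypothetical, and the example deciphers the message text only as far as PM.

```python
from string import ascii_uppercase as NORMAL

CIPHER_COMPONENT = "LEAVNWORTHBCDFGIJKMPQSUXYZ"


def key_letter(plain, cipher, component=CIPHER_COMPONENT):
    """Slide the component so that `cipher` stands opposite `plain`,
    then return the component letter standing opposite A."""
    offset = (component.index(cipher) - NORMAL.index(plain)) % 26
    return component[offset]


def decipher(text, key, component=CIPHER_COMPONENT):
    """Decipher with the alphabet in which A (plain) = key (cipher)."""
    offset = component.index(key)
    return "".join(NORMAL[(component.index(c) - offset) % 26] for c in text)


if __name__ == "__main__":
    key = key_letter("C", "I")          # from the pair Cp = Ic
    print(key)
    print(decipher("IYEWKCERNWOFOSELFOOHEAZ", key))
```

Only a single correct plain-cipher pair is needed to fix the key, which is why the first intelligible generatrix settles the whole message.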
d. The student should make sure that he understands the fundamental principles involved in
this quick solution, for they are among the most important principles in cryptanalytics. How use-
ful they are will become clear as he progresses into more and more complex cryptanalytic studies.

SECTION VII

MULTILITERAL SUBSTITUTION WITH SINGLE-EQUIVALENT CIPHER ALPHABETS

35. Analysis of multiliteral, monoalphabetic substitution systems.-a. Substitution methods
in general may be classified into uniliteral and multiliteral systems.1 In the former there is a
strict "one-to-one" correspondence between the length of the units of the plain and those of the
cipher text; that is, each letter of the plain text is replaced by a single character in the cipher text.
In the latter this correspondence is no longer 1p:1c but may be 1p:2c, where each letter of the plain
text is replaced by a combination of two characters in the cipher text; or 1p:3c, where a 3-character
combination in the cipher text represents a single letter of the plain text, and so on. A cipher in
which the correspondence is of the 1p:1c type is termed uniliteral in character; one in which it is of
the 1p:2c type, biliteral; 1p:3c, triliteral, and so on. Those beyond the 1p:1c type are classed to-
gether as multiliteral.


. 
b. When a multiliteral system employs biliteral equivalents, the cipher alphabet is said to be
bipartite. Such alphabets are composed of a set of 25 or 26 combinations of a limited number of
characters taken in pairs. An example of such an alphabet is the following.

Plain ....... A  B  C  D  E  F  G  H  I  J  K  L  M
Cipher ...... WW WH WI WT WE HW HH HI HT HT HE IW IH

Plain ....... N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Cipher ...... II IT IE TW TH TI TT TE EW EH EI ET EE


This alphabet is derived from the square shown in Fig. 15. 


                 (2)

            W   H   I   T   E
       W    A   B   C   D   E
       H    F   G   H  I-J  K
 (1)   I    L   M   N   O   P
       T    Q   R   S   T   U
       E    V   W   X   Y   Z

              FIGURE 15.


c. If a message is enciphered by means of the foregoing bipartite alphabet the cryptogram is
still monoalphabetic in character. A frequency distribution based upon pairs of letters will
obviously have all the characteristics of a simple, uniliteral distribution for a monoalphabetic
substitution cipher.

1 See Sec. VII, Advanced Military Cryptography.
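
The derivation of the bipartite alphabet from its square, and the point that a count of pairs behaves like an ordinary uniliteral frequency distribution, can be illustrated as follows. The row and column coordinates W H I T E are those of the square above; the function names and the sample plain text ATTACK are hypothetical.

```python
from collections import Counter

COORDS = "WHITE"                                        # row and column letters
ROWS = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]    # I and J share a cell


def build_alphabet():
    """Map each plain letter to its row letter followed by its column letter."""
    table = {}
    for r, row in enumerate(ROWS):
        for c, letter in enumerate(row):
            table[letter] = COORDS[r] + COORDS[c]
    table["J"] = table["I"]
    return table


def encipher(plain):
    """Encipher letters only; anything not in the alphabet is dropped."""
    table = build_alphabet()
    return "".join(table[ch] for ch in plain.upper() if ch in table)


def pair_counts(cipher):
    """Frequency distribution taken on successive pairs of cipher letters."""
    return Counter(cipher[i:i + 2] for i in range(0, len(cipher) - 1, 2))


if __name__ == "__main__":
    cipher = encipher("ATTACK")
    print(cipher)
    print(pair_counts(cipher))
```

Each distinct pair stands for exactly one plain-text letter, so the pair counts equal the letter counts of the plain text.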
d. Ciphers of this type, as well as of those of the multiliteral (triliteral, quadraliteral, ... ) 
type are readily detected externally by virtue of the fact that the cryptographic text is composed 
of but a very limited number of different characters. They are handled in exactly the same man- 
ner as are uniliteral, monoalphabetic substitution ciphers. So long as the same character, or 
combination of characters, is always used to represent the same plain-text letter, and so long as a
given letter of the plain text is always represented by the same character or combination of 
characters, the substitution is strictly monoalphabetic and can be handled in the simple manner 
described under Par. 31 of this text. 


e. An interesting example in which the cipher equivalents are quinqueliteral groups and yet
the resulting cipher is strictly monoalphabetic in character is found in the cipher system invented
by Sir Francis Bacon over 300 years ago. Despite its antiquity the system possesses certain
features of merit which are well worth noting. Bacon 2 proposed the following cipher alphabet,
composed of permutations of two elements taken five at a time: 3

A=aaaaa      I-J=abaaa      R=baaaa
B=aaaab        K=abaab      S=baaab
C=aaaba        L=ababa      T=baaba
D=aaabb        M=ababb    U-V=baabb
E=aabaa        N=abbaa      W=babaa
F=aabab        O=abbab      X=babab
G=aabba        P=abbba      Y=babba
H=aabbb        Q=abbbb      Z=babbb
If this were all there were to Bacon's invention it would be hardly worth bringing to attention.
But what he pointed out, with great clarity and simple examples, was how such an alphabet
might be used to convey a secret message by enfolding it in an innocent, external message which
might easily evade the strictest kind of censorship. As a very crude example, suppose that a
message is written in capital and lower case letters, any capital letter standing for an "a" element
of the cipher alphabet, and any small letter, for a "b" element. Then the external sentence
"All is well with me today" can be made to contain the secret message "Help." Thus:

A L l i s    W E l L W    I t H m E    T o d a Y
a a b b b    a a b a a    a b a b a    a b b b a
    H            E            L            P

Instead of employing such an obvious device as capital and small letters, suppose that an "a"
element be indicated by a very slight shading, or a very slightly heavier stroke. Then a secret
message might easily be thus enfolded within an external message of exactly opposite meaning.
The number of possible variations of this basic scheme is very high. The fact that the characters
of the cryptographic text are hidden in some manner or other has, however, no effect upon the
strict monoalphabeticity of the scheme.

2 For a true picture of this cipher, the explanation of which is often distorted beyond recognition even by cryp-
tographers, see Bacon's own description of it as contained in his De Augmentis Scientiarum (The Advancement of
Learning), as translated by any first-class editor, such as Gilbert Watts (1640) or Ellis, Spedding, and Heath
(1857, 1870). The student is cautioned, however, not to accept as true any alleged "decipherments" obtained
by the application of Bacon's cipher to literary works of the 16th century. These readings are purely subjective.

3 In the 16th Century, the letters I and J were used interchangeably, as were also U and V. Bacon's alphabet
was called by him a "biliteral alphabet" because it employs permutations of two letters. But from the cryptan-
alytic standpoint the significant point is that each plain-text letter is represented by a 5-character equivalent.
Hence, present terminology requires that this alphabet be referred to as a quinqueliteral alphabet.
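
The crude capital-and-small-letter example of Bacon's scheme can be reproduced with the sketch below. It relies on the observation that the 24 equivalents in Bacon's alphabet are the numbers 0 to 23 written as five binary digits with a for 0 and b for 1; the function names bacon_code, enfold, and extract are hypothetical, and the sketch assumes the cover text has at least five letters for every secret letter.

```python
BACON_LETTERS = "ABCDEFGHIKLMNOPQRSTUWXYZ"   # 24 letters: I-J and U-V are shared


def bacon_code(letter):
    """Five-element equivalent of a letter, e.g. H -> aabbb."""
    letter = {"J": "I", "V": "U"}.get(letter, letter)
    n = BACON_LETTERS.index(letter)
    return format(n, "05b").replace("0", "a").replace("1", "b")


def enfold(secret, cover):
    """Write the cover text with a capital for each "a" element and a
    small letter for each "b" element of the secret message."""
    elements = "".join(bacon_code(ch) for ch in secret.upper())
    if sum(ch.isalpha() for ch in cover) < len(elements):
        raise ValueError("cover text is too short")
    out, k = [], 0
    for ch in cover:
        if ch.isalpha() and k < len(elements):
            out.append(ch.upper() if elements[k] == "a" else ch.lower())
            k += 1
        else:
            out.append(ch)
    return "".join(out)


def extract(text):
    """Read capitals as "a" and small letters as "b", five at a time."""
    elements = "".join("a" if ch.isupper() else "b" for ch in text if ch.isalpha())
    codes = {bacon_code(letter): letter for letter in BACON_LETTERS}
    usable = len(elements) - len(elements) % 5
    return "".join(codes[elements[i:i + 5]] for i in range(0, usable, 5))


if __name__ == "__main__":
    hidden = enfold("HELP", "All is well with me today")
    print(hidden)
    print(extract(hidden))
```

However the two kinds of element are disguised, each group of five still stands for one and the same plain-text letter, which is what keeps the system monoalphabetic.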


36. Historically interesting examples.-a. Two examples of historical interest will be cited
in this connection as illustrations. During the campaign for the presidential election of 1876
many cipher messages were exchanged between the Tilden managers and their agents in several
states where the voting was hotly contested. Two years later the New York Tribune 4 exposed
many irregularities in the campaign by publishing the decipherments of many of these messages.
These decipherments were achieved by two investigators employed by the Tribune, and the
plain text of the messages seems to show that illegal attempts and measures to carry the election
for Tilden were made by his managers. Here is one of the messages:


GEO. F. RANEY, Tallahassee. 


JACKSONVILLE, Nov. 16 (1876). 
Ppyyemnsnyyypimashnsyyssitepaaenshns 
pensshnsmmpiyysnppyeaapieissyeshainsssp 
eeiyyshnynsssyepiaanyitnsshyyspyypinsyy 
ssitemeipimmeisseiyyeissiteiepyypeeiaass 
imaayespnsyyianssseissmmppnspinssnpinsim 
imyyitemyysspeyymmnsyyssitspyypeepppma 
aayypiit
L'Engle goes up tomorrow. 
DANIEL. 


Examination of the message discloses that only ten different letters are used. It is probable, 
therefore, that what one has here is a cipher which employs a bipartite alphabet and in which
combinations of two letters represent single letters of the plain text. The message is therefore 
rewritten in pairs and substitution of arbitrary letters for the pairs is made, as seen below: 


PP YY EM NS NY YY PI MA SH NS YY SS etc.
A  B  C  D  E  B  F  G  H  D  B  I  etc.
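
The rewriting of the message in pairs and the substitution of arbitrary letters for the pairs can be sketched as follows. The function name relabel_pairs is hypothetical, and because single letters are used as labels the sketch handles at most 26 different pairs.

```python
from string import ascii_uppercase


def relabel_pairs(cipher):
    """Split the text into pairs and give each new pair the next unused
    letter of the alphabet; return the pairs and the relabeled text."""
    letters = [c for c in cipher.upper() if c.isalpha()]
    pairs = ["".join(letters[i:i + 2]) for i in range(0, len(letters) - 1, 2)]
    labels = {}
    out = []
    for pair in pairs:
        if pair not in labels:
            # Fails with IndexError if more than 26 distinct pairs occur.
            labels[pair] = ascii_uppercase[len(labels)]
        out.append(labels[pair])
    return pairs, "".join(out)


if __name__ == "__main__":
    pairs, relabeled = relabel_pairs("Ppyyemnsnyyypimashnsyyss")
    print(" ".join(pairs))
    print(relabeled)
```

The relabeled text is an ordinary uniliteral cryptogram and can be attacked by the methods of the preceding section.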
A triliteral frequency distribution is then made and analysis of the message along the lines 
illustrated in the preceding section of this text yields solution, as follows: 


GEO. F. RANEY, Tallahassee:
JACKSONVILLE, Nov. 16.

Have Marble and Coyle telegraph for influential men from Delaware and Virginia. Indi-
cations of weakening here. Press advantage and watch Board. L'Engle goes up tomorrow.

DANIEL.

b. The other example, using numbers, is as follows:

JACKSONVILLE, Nov. 17.
S. PASCO and E. M. L'ENGLE:

84 55 84 25 93 34 82 31 31 75 93 82 77 33 55 42 93 20
93 66 77 66 33 84 66 31 31 93 20 82 33 66
52 48 44 55 42 82 48 89 42 93 31 82 66 75 31 93

DANIEL.

4 New York Tribune, Extra No. 44, The Cipher Dispatches, New York, 1879.

There were, of course, several messages of like nature, and examination disclosed that
only 26 different numbers in all were used. Solution of these ciphers followed very easily, the
decipherment of the one given above being as follows:

JACKSONVILLE, Nov. 17.
S. PASCO and E. M. L'ENGLE:
Cocke will be ignored, Eagan called in. Authority reliable.

DANIEL.


c. The Tribune experts gave the following alphabets as the result of their decipherments: 


AA=O    EN=Y    IT=D    NS=E    PP=H    SS=N
AI=U    EP=C    MA=B    NY=M    SH=L    YE=F
EI=I    IA=K    MM=G    PE=T    SN=P    YI=X
EM=V    IM=S    NN=J    PI=R    SP=W    YY=A

20=D    33=N    44=H    62=X    77=G    89=Y
25=K    34=W    48=T    66=A    82=I    93=E
27=S    39=P    52=U    68=F    84=C    96=M
31=L    42=R    55=O    75=B    87=V    99=J
They did not attempt to correlate these alphabets, or at least they say nothing about a possible 
relationship. The present author has, however, reconstructed the rectangle upon which these 
alphabets are based, and it is given below (fig. 16). 


                         2d Letter or Number

                    H  I  S  P  A  Y  M  E  N  T
                    1  2  3  4  5  6  7  8  9  0
             H 1    .  .  .  .  .  .  .  .  .  .
             I 2    .  .  .  .  K  .  S  .  .  D
  1st        S 3    L  .  N  W  .  .  .  .  P  .
  Letter     P 4    .  R  .  H  .  .  .  T  .  .
  or         A 5    .  U  .  .  O  .  .  .  .  .
  Number     Y 6    .  X  .  .  .  A  .  F  .  .
             M 7    .  .  .  .  B  .  G  .  .  .
             E 8    .  I  .  C  .  .  V  .  Y  .
             N 9    .  .  E  .  .  M  .  .  J  .
             T 0    .  .  .  .  .  .  .  .  .  .

                          FIGURE 16.

It is amusing to note that the conspirators selected as their key a phrase quite in keeping with
their attempted illegalities-HIS PAYMENT-for bribery seems to have played a considerable
part in that campaign. The blank squares in the diagram probably contained proper names,
numbers, etc.
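
The relationship between the letter alphabet and the number alphabet can be checked mechanically: if the ten letters of HIS PAYMENT are numbered 1, 2, 3, ... 9, 0 in order, every letter pair in the first alphabet should turn into the number that stands for the same plain-text letter in the second. The sketch below makes that check, using the two alphabets as given by the Tribune experts in subparagraph c; the function names to_digits and decipher are hypothetical.

```python
KEY = "HISPAYMENT"
DIGITS = "1234567890"

LETTER_ALPHABET = {
    "AA": "O", "AI": "U", "EI": "I", "EM": "V", "EN": "Y", "EP": "C",
    "IA": "K", "IM": "S", "IT": "D", "MA": "B", "MM": "G", "NN": "J",
    "NS": "E", "NY": "M", "PE": "T", "PI": "R", "PP": "H", "SH": "L",
    "SN": "P", "SP": "W", "SS": "N", "YE": "F", "YI": "X", "YY": "A",
}
NUMBER_ALPHABET = {
    "20": "D", "25": "K", "27": "S", "31": "L", "33": "N", "34": "W",
    "39": "P", "42": "R", "44": "H", "48": "T", "52": "U", "55": "O",
    "62": "X", "66": "A", "68": "F", "75": "B", "77": "G", "82": "I",
    "84": "C", "87": "V", "89": "Y", "93": "E", "96": "M", "99": "J",
}


def to_digits(pair):
    """Replace each key letter by its position number (T counts as 0)."""
    return "".join(DIGITS[KEY.index(ch)] for ch in pair)


def decipher(text, alphabet):
    """Decipher a run of two-character equivalents, ignoring spaces."""
    text = text.replace(" ", "").upper()
    return "".join(alphabet[text[i:i + 2]] for i in range(0, len(text), 2))


if __name__ == "__main__":
    agree = all(NUMBER_ALPHABET[to_digits(p)] == v for p, v in LETTER_ALPHABET.items())
    print("alphabets agree:", agree)
    print(decipher("Ppyyemns", LETTER_ALPHABET))
    print(decipher("84 55 84 25 93", NUMBER_ALPHABET))
```

That all 24 equivalents agree is what justifies treating the two alphabets as two labelings of one and the same rectangle.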
SECTION VIII


MULTILITERAL SUBSTITUTION WITH MULTIPLE-EQUIVALENT CIPHER ALPHABETS
                                                                                  Paragraph
Purpose of providing multiple-equivalent cipher alphabets ........................... 37
Solution of a simple example ........................................................ 38
Solution of more complicated example ................................................ 39
A subterfuge to prevent decomposition of cipher text into component units ........... 40
37. Purpose of providing multiple-equivalent cipher alphabets.-a. It has been seen that
the characteristic frequencies of letters composing normal plain text, the associations they form 
in combining to form words, and the peculiarities certain of them manifest in such text all afford 
direct clues by means of which ordinary monoalphabetic substitution encipherments of such 
plain text may be more or less speedily solved. This has led to the introduction of simple 
methods for disguising or suppressing the manifestations of monoalphabeticity, so far as possible. 
Basically these methods are multiliteral and they will now be presented. 
b. Multiliteral substitution may be of two types: (1) That wherein each letter of the plain
text is represented by one and only one multiliteral equivalent. For example, in the Francis
Bacon cipher described in Par. 35e, the letter Kp is invariably represented by the permutation
abaab. For this reason this type of system may be more completely described as monoalpha-
betic, multiliteral substitution with single-equivalent cipher alphabets.
(2) That wherein, because of the large number of equivalents made available by the com-
binations and permutations of a limited number of elements, each letter of the plain text may be
represented by several multiliteral equivalents which may be selected at random. For example,
if 3-letter combinations are employed there are available 26^3 or 17,576 equivalents for the 26
letters of the plain text; they may be assigned in equal numbers of different equivalents for the
26 letters, in which case each letter would be representable by 676 different 3-letter equivalents;
or they may be assigned on some other basis, for example, proportionately to the relative
frequencies of plain-text letters. For this reason this type of system may be more completely
described as monoalphabetic, multiliteral substitution with multiple-equivalent cipher alphabets.
Some authors term such a system "simple substitution with multiple equivalents"; others term
it monoalphabetic substitution with variants. For the sake of brevity, the latter designation will
be employed in this text.
c. The primary object of monoalphabetic substitution with variants is, as has been men- 
tioned above, to provide several values which may be employed at random in a simple substitution 
of cipher equivalents for the plain-text letters. In this connection, reference is made to Section 
X of Elementary Military Cryptography, wherein several of the most common methods for 
producing and using variants are set forth. 
d. A word or two concerning the underlying theory from the cryptanalytic point of view of 
monoalphabetic substitution with variants, may not be amiss. Whereas in simple or single- 
equivalent, monoalphabetic substitution it is seen that- 
(1) The same letter of the plain text is invariably represented by but one and always the
same character of the cryptogram, and
(2) The same character of the cryptogram invariably represents one and always the same
letter of the plain text;
in multiliteral substitution with multiple equivalents (monoalphabetic substitution with
variants) it is seen that-
(1) The same letter of the plain text may be represented by one or more different characters
of the cryptogram, but
(2) The same character of the cryptogram nevertheless invariably represents one and always
the same letter of the plain text.
38. Solution of a simple example.-a. The following cryptogram has been enciphered by a
set of four alphabets similar to the following:

A   B   C   D   E   F   G   H   I-J K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z
08  09  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  01  02  03  04  05  06  07
35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  26  27  28  29  30  31  32  33  34
68  69  70  71  72  73  74  75  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67
87  88  89  90  91  92  93  94  95  96  97  98  99  00  76  77  78  79  80  81  82  83  84  85  86


The keyword here is TRIP.1 In enciphering a message the equivalents are to be selected at
random from among the four variants for each letter. The steps in solving a message produced 
by such a scheme will now be scrutinized. 
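
The enciphering scheme just described can be sketched in Python. This is only an illustration under stated assumptions: the function names are invented here, I and J are merged into I, and the number 100 is written 00 as in the table.

```python
import random

ALPHABET_25 = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # I and J share one place


def build_variant_table(keyword):
    """Map each letter to its variants, one from each of the four alphabets."""
    table = {letter: [] for letter in ALPHABET_25}
    for row, key_letter in enumerate(keyword.replace("J", "I")):
        start = ALPHABET_25.index(key_letter)
        for offset in range(25):
            number = row * 25 + offset + 1           # runs from 1 to 100
            letter = ALPHABET_25[(start + offset) % 25]
            table[letter].append(f"{number % 100:02d}")  # 100 is written 00
    return table


def encipher(plain_text, keyword, rng=random):
    """Replace every letter by one of its variants, chosen at random."""
    table = build_variant_table(keyword)
    letters = [c for c in plain_text.upper().replace("J", "I") if c in table]
    return "".join(rng.choice(table[c]) for c in letters)


TRIP = build_variant_table("TRIP")
print(TRIP["A"])                   # the four variants for A
print(encipher("ATTACK", "TRIP"))  # differs from run to run
```

Because the variant is drawn at random, two encipherments of the same text will seldom agree, which is the whole purpose of the system.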


CRYPTOGRAM


68321 09022 48057 65111 88648 42036 45235 09144 05764 22684
00225 57003 97357 14074 82524 40768 51058 93074 92188 47264
09328 04255 06186 79882 85144 45886 32574 55136 56019 45722
76844 68350 45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148 56109 08546
62062 65509 32800 32568 97216 44282 34031 84989 68564 53789
12530 77401 68494 38544 11368 87616 56905 20710 58864 67472
22490 09136 62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516 52263 21175
06445 72255 68951 86957 76095 67215 53049 08567 9730


b. Assuming that the foregoing remarks had not been made and that the cryptogram has 
just been submitted for solution with no information concerning it, the first step is to make a 
preliminary study to determine whether the cryptogram involves cipher or code. The crypto- 
gram appears in 5-figure groups, which may indicate either cipher or code. A few remarks will 
be made at this point with reference to the method of determining whether a cryptogram com- 
posed of figure groups is in code or cipher, using the foregoing example. 
c. In the first place, if the cryptogram contains an even number of digits, as for example
494 in the foregoing message, this leaves open the possibility that it may be cipher, composed of
247 pairs of digits; were the number of digits an exact odd multiple of five, such as 125, 135, etc.,
the possibility that the cryptogram is in code of the 5-figure group type must be considered. Next,
a preliminary study is made to see if there are many repetitions, and what their characteristics
are. If the cryptogram is code of the 5-figure group type, then such repetitions as appear should
generally be in whole groups of five digits, and they should be visible in the text just as the
message stands, unless the code message has undergone encipherment also. If the cryptogram is in
cipher, then the repetitions should extend beyond the 5-digit groupings; if they conform to any
definite groupings at all they should for the most part contain even numbers of digits since each
letter is probably represented by a pair of digits. If no clues of the foregoing nature are present,
doubts will be dissolved by making a detailed study of frequencies.

1 The letter corresponding to the lowest number in each line of the diagram showing the cipher alphabets
is a key letter. Thus, in the 1st line 01=T; in the 2d line 26=R; etc.
d. A simple 4-part frequency distribution is therefore decided upon. Shall the alphabet be 
assumed to be a 25- or a 26-character one? If the former, then the 2-digit pairs from 01 to 00 
fall into exactly four groups each corresponding to an alphabet. Since this is the most common 
scheme of drawing up such alphabets, let it be assumed to be true of the present case. The 
following distributions result from the breaking up of the text into 2-digit pairs. 


01-///        26-///        51-/////      76-///// /
02-           27-           52-/////      77-/
03-////       28-/          53-///        78-
04-/          29-/          54-           79-/
05-/////      30-///        55-////       80-///
06-///// /    31-           56-/////      81-
07-///        32-///// /    57-///// /    82-////
08-           33-/          58-//         83-/
09-////       34-/          59-           84-///// /
10-////       35-//         60-           85-///// /
11-/////      36-/////      61-           86-///
12-///        37-/          62-//         87-
13-/          38-           63-           88-////
14-/          39-/          64-///// /    89-/////
15-/          40-///        65-           90-///// /
16-///        41-           66-/          91-///
17-           42-////       67-//         92-/
18-///// /    43-/          68-///// //   93-/
19-           44-///// /    69-//         94-/
20-/          45-///// /    70-/          95-///
21-//         46-///        71-/          96-
22-/////      47-           72-////       97-///// /
23-//         48-///        73-           98-/
24-           49-/////      74-////       99-
25-/          50-/////      75-/          00-//
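
The 4-part distribution can be reproduced mechanically. The following Python sketch is one way of doing it, not the procedure of the text itself. It assumes the digit groups as transcribed below, treats the pair 00 as the number 100, and prints one tally column for each of the four assumed 25-number alphabets.

```python
from collections import Counter

# Digit groups of the cryptogram as transcribed for this sketch.
CRYPTOGRAM = """
68321 09022 48057 65111 88648 42036 45235 09144 05764 22684
00225 57003 97357 14074 82524 40768 51058 93074 92188 47264
09328 04255 06186 79882 85144 45886 32574 55136 56019 45722
76844 68350 45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148 56109 08546
62062 65509 32800 32568 97216 44282 34031 84989 68564 53789
12530 77401 68494 38544 11368 87616 56905 20710 58864 67472
22490 09136 62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516 52263 21175
06445 72255 68951 86957 76095 67215 53049 08567 9730
"""

digits = "".join(ch for ch in CRYPTOGRAM if ch.isdigit())
pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
counts = Counter(pairs)


def block_of(pair):
    """Return 0..3, the assumed alphabet (01-25, 26-50, 51-75, 76-00) of a pair."""
    number = int(pair) or 100  # the pair 00 stands for 100
    return (number - 1) // 25


print(len(digits), "digits,", len(pairs), "pairs")
for row in range(25):
    cells = []
    for block in range(4):
        pair = f"{(block * 25 + row + 1) % 100:02d}"
        cells.append(f"{pair}-{'/' * counts[pair]:<9}")
    print("  ".join(cells))
```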


e. If the student will bring to bear upon this problem the principles he learned in Section V
of this text, he will soon realize that what he now has before him are four simple, monoalphabetic
frequency distributions similar to those involved in a monoalphabetic substitution cipher
using standard cipher alphabets. The realization of this fact immediately provides the clue to
the next step: "fitting each of the distributions to the normal." (See Par. 17b.) This can be
done without difficulty in this case (remembering that a 25-letter alphabet is involved and
assuming that I and J are the same letter) and the following alphabets result:


01-I-J     26-U       51-N       76-E
02-K       27-V       52-O       77-F
03-L       28-W       53-P       78-G
04-M       29-X       54-Q       79-H
05-N       30-Y       55-R       80-I-J
06-O       31-Z       56-S       81-K
07-P       32-A       57-T       82-L
08-Q       33-B       58-U       83-M
09-R       34-C       59-V       84-N
10-S       35-D       60-W       85-O
11-T       36-E       61-X       86-P
12-U       37-F       62-Y       87-Q
13-V       38-G       63-Z       88-R
14-W       39-H       64-A       89-S
15-X       40-I-J     65-B       90-T
16-Y       41-K       66-C       91-U
17-Z       42-L       67-D       92-V
18-A       43-M       68-E       93-W
19-B       44-N       69-F       94-X
20-C       45-O       70-G       95-Y
21-D       46-P       71-H       96-Z
22-E       47-Q       72-I-J     97-A
23-F       48-R       73-K       98-B
24-G       49-S       74-L       99-C
25-H       50-T       75-M       00-D
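
"Fitting to the normal" is done by eye in the text. A rough mechanical stand-in can be sketched in Python as below. The frequency figures are approximate values assumed for illustration, not the tables of this series, and the scoring rule (sum of tally times normal frequency) is only one of several that could be used. The tallies are those of the numbers 01 to 25.

```python
ALPHABET_25 = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # I and J combined

# Assumed approximate frequencies of English letters per 1,000 (I includes J).
NORMAL = {
    "A": 82, "B": 15, "C": 28, "D": 43, "E": 127, "F": 22, "G": 20, "H": 61,
    "I": 72, "K": 8, "L": 40, "M": 24, "N": 67, "O": 75, "P": 19, "Q": 1,
    "R": 60, "S": 63, "T": 91, "U": 28, "V": 10, "W": 24, "X": 2, "Y": 20,
    "Z": 1,
}


def fit_to_normal(tallies):
    """tallies[i] is the count of the i-th number of one 25-number alphabet.

    Try every one of the 25 possible starting letters and return the one whose
    normal sequence agrees best with the tallies, that is, the key letter.
    """
    def score(start):
        return sum(count * NORMAL[ALPHABET_25[(start + i) % 25]]
                   for i, count in enumerate(tallies))

    return ALPHABET_25[max(range(25), key=score)]


# Tallies of the numbers 01 to 25 in the distribution above.
FIRST_ALPHABET = [3, 0, 4, 1, 5, 6, 3, 0, 4, 4, 5, 3, 1,
                  1, 1, 3, 0, 6, 0, 1, 2, 5, 2, 0, 1]
print(fit_to_normal(FIRST_ALPHABET))
```

With only some sixty tallies to an alphabet, such a score should be treated as a suggestion to be confirmed by trial decipherment, not as proof.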


f. The keyword is seen to be JUNE and the first few groups of the cryptogram decipher as
follows:

68 32 10 90 22 48 05 76 51
E  A  S  T  E  R  N  E  N

11 88 64 84 20 36 45 23
T  R  A  N  C  E  O  F
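
Once the four key letters are known, decipherment is purely mechanical. A minimal Python sketch follows, assuming the layout found above (four standard 25-letter alphabets beginning at 01, 26, 51, and 76, with 00 counted as 100). The function name is illustrative.

```python
ALPHABET_25 = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # I and J share one place


def decipher(cipher_text, keyword="JUNE"):
    """Decipher 2-digit pairs with four standard alphabets keyed by keyword."""
    digits = "".join(ch for ch in cipher_text if ch.isdigit())
    plain = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2]) or 100      # the pair 00 stands for 100
        row, offset = divmod(number - 1, 25)      # which alphabet, and how far in
        key_letter = keyword[row].replace("J", "I")
        start = ALPHABET_25.index(key_letter)
        plain.append(ALPHABET_25[(start + offset) % 25])
    return "".join(plain)


FIRST_LINE = "68321 09022 48057 65111 88648 42036 45235 09144 05764 22684"
print(decipher(FIRST_LINE))
```

The output is printed without word divisions and with I standing for both I and J.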


g. From the detailed procedure given above, the student should be able to draw his own 
conclusions as to the procedure to be followed in solving cryptograms produced by methods 
which are more or less simple variations of that just discussed. In this connection he is referred 
to Section X of Elementary Military Cryptography, wherein a few of these variations are mentioned. 


h. Possibly the most important of the variations is that in which a rectangle such as that
shown in Fig. 17 is employed.

             1   2   3   4   5   6   7   8   9   0
1, 4, 7      A   B   C   D   E   F   G   H   I   J
2, 5, 8      K   L   M   N   O   P   Q   R   S   T
3, 6, 9      U   V   W   X   Y   Z   .   ,   :   ;

                    FIGURE 17.

In the solution of cases of this kind, repetitions would play their usual role, with the modifications
noted below in Par. 39. Once an entering wedge has been forced, through the identification
of one or more repeated words such as BATTALION, DIVISION, etc., the entire enciphering
rectangle would soon be reconstructed. It may be added that the frequency distribution for
the text of a single long message or several short ones enciphered by such a system would show
characteristic phenomena, the most important of which are, first, that the distribution for a
rectangle such as shown in Fig. 17 would practically follow the normal and, second, that the
distribution for the 2d digit of pairs would show more marked crests and troughs than the
distribution for the 1st digit. For example, the initial digits 1, 4, and 7 (for the numbers 10-19,
40-49, and 70-79, inclusive) would apply to the distribution for the letters A to J, inclusive; the
initial digits 2, 5, and 8 would apply to the distribution for the letters K to T, inclusive. The
total weighted frequency values for these two groups of letters are about equal. Therefore,
the frequencies of the initial digits 1, 2, 4, 5, 7, and 8 would be approximately equal. But
consider the final digit 5 in the numbers 15, 45, 75, 25, 55, and 85; its total frequency is composed
of the frequency of Ep plus the frequency of Op; whereas in the case of the final digit 6,
its total frequency is composed of the frequency of Fp plus the frequency of Pp. The two cases
would show a marked difference in frequency. Of course, the letters may be inserted within
the enciphering rectangle in a keyword-mixed or even in a random order; the numbers may be
applied to the rectangle in a random order. But these variations, while increasing the difficulty
in solution, by no means make the latter as great as may be thought by the novice.
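
The behavior of the two digit positions can be checked with a short Python sketch. It assumes the rectangle of Fig. 17 with the letters in normal order, ignores the punctuation cells, and uses approximate English letter frequencies that are assumed for illustration only.

```python
# Rows of the rectangle: first digits that select the row, then its letters.
ROWS = {
    "147": "ABCDEFGHIJ",
    "258": "KLMNOPQRST",
    "369": "UVWXYZ",  # the remaining cells of this row are not letters
}
COLUMN_DIGITS = "1234567890"

# Assumed approximate frequencies of English letters per 1,000.
NORMAL = {
    "A": 82, "B": 15, "C": 28, "D": 43, "E": 127, "F": 22, "G": 20, "H": 61,
    "I": 70, "J": 2, "K": 8, "L": 40, "M": 24, "N": 67, "O": 75, "P": 19,
    "Q": 1, "R": 60, "S": 63, "T": 91, "U": 28, "V": 10, "W": 24, "X": 2,
    "Y": 20, "Z": 1,
}


def variants(letter):
    """Return the three 2-digit equivalents of a letter."""
    for first_digits, letters in ROWS.items():
        if letter in letters:
            second = COLUMN_DIGITS[letters.index(letter)]
            return [first + second for first in first_digits]
    raise ValueError(letter)


def digit_weights():
    """Expected weight of each first digit and each second digit."""
    first = {d: 0.0 for d in "123456789"}
    second = {d: 0.0 for d in COLUMN_DIGITS}
    for letter, frequency in NORMAL.items():
        choices = variants(letter)
        for pair in choices:  # variants assumed to be used equally often
            first[pair[0]] += frequency / len(choices)
            second[pair[1]] += frequency / len(choices)
    return first, second


first, second = digit_weights()
print(variants("E"), variants("O"))
print("first digits :", {d: round(w) for d, w in first.items()})
print("second digits:", {d: round(w) for d, w in second.items()})
```

Under these assumptions the first digits 1, 2, 4, 5, 7, and 8 come out nearly level, while the second digits vary widely, the digit 5 far outweighing the digit 6.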
39. Solution of a more complicated example.-a. As soon as a beginner in cryptography 
realizes the consequences of the fact that letters are used with greatly varying frequencies 
in normal plain text, a brilliant idea very speedily comes to him. Why not disguise the
natural frequencies of letters by a system of substitution using many equivalents, and let 
the numbers of equivalents assigned to the various letters be more or less in direct proportion 
to the normal frequencies of the letters? Let E, for example, have 13 or more equivalents; T, 10; 
N, 9; etc., and thus (he thinks) the enemy cryptanalyst can have nothing in the way of tell-tale 
or characteristic frequencies to use as an entering wedge. 
b. If the text available for study is small in amount and if the variant values are wholly
independent of one another, the problem can become exceedingly difficult. But in practical
military communications such methods are rarely encountered, because the volume of text is usually
great enough to permit of the establishment of equivalent values. To illustrate what is meant,
suppose a set of cryptograms produced by the monoalphabetic-variant method described above
shows the following two sets of groupings in the text:

SET A
12-37-02-79-68-13-03-37-77
82-69-03-79-13-68-23-37-35
82-69-51-16-13-13-78-05-35
91-05-02-01-68-42-78-37-77

SET B
71-12-02-51-23-05-77
11-82-51-02-03-05-35
11-91-02-02-23-37-35
97-12-51-03-78-69-77

An examination of these groupings would lead to the following tentative conclusions with regard
to probable equivalents:

12, 82, 91          01, 16, 79          03, 23, 78
05, 37, 69          13, 42, 68          35, and 77
02, and 51

The establishment of these equivalencies would sooner or later lead to the finding of additional
sets of equal values. The completeness with which this can be accomplished will determine
the ease or difficulty of solution. Of course, if many equivalencies can be established the
problem can then be reduced practically to monoalphabetic terms and a speedy solution can
be attained.
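
The comparison of such groupings, position by position, can be sketched in Python as follows. The sketch assumes that the four groupings of SET A stand for the same plain-text word, and it merely lists the values seen at each position. The function name is illustrative.

```python
# The four groupings of SET A, one string per occurrence.
SET_A = [
    "12-37-02-79-68-13-03-37-77",
    "82-69-03-79-13-68-23-37-35",
    "82-69-51-16-13-13-78-05-35",
    "91-05-02-01-68-42-78-37-77",
]


def column_values(groupings):
    """List the distinct values found at each position of the groupings."""
    rows = [grouping.split("-") for grouping in groupings]
    return [sorted(set(column)) for column in zip(*rows)]


for position, values in enumerate(column_values(SET_A), start=1):
    print(position, values)
```

Each list is only a candidate set of equivalents. A value may stand in a column for reasons other than equivalence, so every set remains tentative until other repetitions confirm it.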


c. Theoretically, the determination of equivalencies may seem to be quite an easy matter,
but practically it may be very difficult, because the cryptanalyst can never be certain that a
combination showing what may appear to be a variant value is really such, and is not a different
word. For example, take the groups-
17-82-31-82-14-63, and
27-82-40-82-14-63
Here one might suspect that 17 and 27 represent the same letter, 31 and 40 another letter. But
it happens that one group represents the word MANAGE, the other DAMAGE.
d. When reversible combinations are used as variants, the problem is perhaps a bit more
simple. For example, using the accompanying Fig. 18 for encipherment, two messages with
the same initial words, REFERENCE YOUR, may be enciphered as follows:

         R E F     E R       E N C     E Y       O U R
(1)      NHWDR     XLSHC     DWWZN     RSLHP     SRBJC     H . . .
(2)      CHDWR     XSLHN     DWZWN     RLSHP     RWJBN     H . . .

            K,Z   Q,V   B,H   M,R   D,L
   W,S       N     H     A     O     E
   F,X       D     T     M     F     P
   G,J       Q     B     U     I     V
   C,N       G     X     R     C     S
   P,T       Z     L     Y     W     K

                  FIGURE 18.

The experienced cryptanalyst, noting the appearance of the very first few groups, assumes that
he is here confronted with a case involving biliteral reversible equivalents, with variants.
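
That the two versions conceal the same text can be verified with a short Python sketch. It assumes the square and indicators as read from Fig. 18, and relies on the fact that no letter serves as both a row and a column indicator, so that a pair may be read in either order. The names are illustrative.

```python
ROW_INDICATORS = ["WS", "FX", "GJ", "CN", "PT"]
COLUMN_INDICATORS = ["KZ", "QV", "BH", "MR", "DL"]
SQUARE = ["NHAOE", "DTMFP", "QBUIV", "GXRCS", "ZLYWK"]


def _index(indicators, letter):
    """Position of the indicator group containing letter, or None."""
    for i, group in enumerate(indicators):
        if letter in group:
            return i
    return None


def decipher(cipher_text):
    """Decipher reversible biliteral equivalents, whichever letter comes first."""
    letters = [c for c in cipher_text.upper() if c.isalpha()]
    plain = []
    for a, b in zip(letters[0::2], letters[1::2]):
        if _index(ROW_INDICATORS, a) is None:  # column indicator came first
            a, b = b, a
        row = _index(ROW_INDICATORS, a)
        column = _index(COLUMN_INDICATORS, b)
        plain.append(SQUARE[row][column])
    return "".join(plain)


print(decipher("NHWDR XLSHC DWWZN RSLHP SRBJC H"))
print(decipher("CHDWR XSLHN DWZWN RLSHP RWJBN H"))
```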
e. The probable-word method of solution may be used, but with a slight variation introduced
by virtue of the fact that, regardless of the system, letters of low frequency in plain text
remain infrequent. Hence, suppose a word containing low-frequency letters, but in itself a
rather common word strikingly idiomorphic in character is sought as a "probable word"; for
example, words such as CAVALRY, ATTACK, and PREPARE. Writing such a word on a slip of
paper, it is slid one interval at a time under the text, which has been marked so that the high
and low-frequency characters are indicated. Each coincidence of a low-frequency letter of the
text with a low-frequency letter of the assumed word is examined carefully to see whether the
adjacent text letters correspond in frequency with the other letters of the assumed word; or, if
the latter presents repetitions, whether there are correspondences between repetitions in the
text and those in the word. Many trials are necessary but this method will produce results
when the difficulties are otherwise too much for the cryptanalyst to overcome.


40. A subterfuge to prevent decomposition of cipher text into component units.-a. A few
words should be added with regard to certain subterfuges which are sometimes encountered in
monoalphabetic substitution with variants, and which, if not recognized in time, cause
considerable delays. These have to deal with the insertion of nulls so as to prevent the cryptanalyst
from breaking up the text into its real cryptographic units. The student should take careful
note of the last phrase; the mere insertion of symbols having the same characteristics as the
symbols of the cryptographic text, except that they have no meaning, is not what is meant.
This class of nulls rarely achieves the purpose for which they are intended. What is really meant
can best be explained in connection with an example. Suppose that a 5 x 5 checkerboard design
with the row and column indicators shown in Fig. 19 is adopted for encipherment. Normally,
the cipher units would consist of 2-letter combinations of the indicators, invariably giving the
row indicator first (by agreement).

               V,A,T,F   G,H,O,U   I,P,E,R   W,S,B,L   D,M,N,C
   V,A,T,F        A         B         C         D         E
   G,H,O,U        F         G         H        I-J        K
   I,P,E,R        L         M         N         O         P
   W,S,B,L        Q         R         S         T         U
   D,M,N,C        V         W         X         Y         Z

                          FIGURE 19.

The phrase COMMANDER OF SPECIAL TROOPS might be enciphered thus:

C   O   M   M   A   N   D   E   R   O   F
VI  EB  PH  IU  FT  IE  AB  TM  WO  PW  GT  . . .

These would normally then be arranged in 5-letter groups, thus:

VIEBP   HIUFT   IEABT   MWOPW   GT . . .


b. It will be noted, however, that only 20 of the 26 letters of the alphabet have been employed
as row and column indicators, leaving J, K, Q, X, Y, and Z unused. Now, suppose these six letters
are used as nulls, not in pairs, but as individual letters inserted at random just before the real text is
arranged in 5-letter groups. Occasionally, a pair of nulls is inserted. Thus, for example:

VIEXB PHKIU FJXTI EAJBT MWOQP WGKTY

The cryptanalyst, after some study, suspecting a biliteral cipher, proceeds to break up the text
into pairs:

VI  EX  BP  HK  IU  FJ  XT  IE  AJ  BT  MW  OQ  PW  GK  TY

Compare this set of 2-letter combinations with the correct set. Only 4 of the 15 pairs are "proper"
units. It is easy to see that without a knowledge of the existence of the nulls, and even with a
knowledge, if he does not know which letters are nulls, the cryptanalyst would be confronted with
a problem for the solution of which a fairly large amount of text might be necessary. The
careful employment of the variants also very materially adds to the security of the method
because repetitions can be rather effectively suppressed.
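
The effect of the nulls on the division into pairs can be shown with a short Python sketch. It assumes the checkerboard and indicators as read from Fig. 19 and a known set of null letters, which is exactly what the cryptanalyst lacks at the outset. The names are illustrative.

```python
# The same four-letter indicator sets serve for rows and for columns.
INDICATORS = ["VATF", "GHOU", "IPER", "WSBL", "DMNC"]
SQUARE = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]  # I stands for I-J
NULLS = set("JKQXYZ")


def pairs(text):
    """Break a text into consecutive 2-letter units, ignoring spaces."""
    letters = [c for c in text.upper() if c.isalpha()]
    return ["".join(letters[i:i + 2]) for i in range(0, len(letters) - 1, 2)]


def strip_nulls(text):
    """Remove every null letter from the text."""
    return "".join(c for c in text.upper() if c.isalpha() and c not in NULLS)


def decipher(text):
    """Decipher null-free text; the row indicator is the first letter of a pair."""
    def index(letter):
        return next(i for i, group in enumerate(INDICATORS) if letter in group)

    return "".join(SQUARE[index(p[0])][index(p[1])] for p in pairs(text))


SENT = "VIEXB PHKIU FJXTI EAJBT MWOQP WGKTY"
true_units = pairs(strip_nulls(SENT))
naive_units = pairs(SENT)
# Crude check: a naive pair counts as proper if it is one of the true units.
proper = [unit for unit in naive_units if unit in true_units]

print(decipher(strip_nulls(SENT)))
print(len(proper), "of", len(naive_units), "naive pairs are proper:", proper)
```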
c. From the cryptographic standpoint, the fact that in this system the cryptographic text
is more than twice as long as the plain text constitutes a serious disadvantage. From the
cryptanalytic standpoint, the masking of the cipher units constitutes the most important source
of strength of the system; this, coupled with the use of variants, makes it a bit more difficult
system to solve, despite its monoalphabeticity.

41. Monographic and polygraphic substitution systems.-a. The student is now referred
to Sections VII and VIII of Advanced Military Cryptography, wherein polygraphic systems of
substitution are discussed from the cryptographic point of view. These will now be discussed
from the cryptanalytic point of view.
b. Although the essential differences between polyliteral and polygraphic substitution are
treated with some detail in Section VII of Advanced Military Cryptography, a few additional
words on the subject may not be amiss at this point.
c. The two primary divisions of substitution systems into (1) uniliteral and multiliteral
methods and into (2) monographic and polygraphic methods are both based upon considerations
as to the number of elements constituting the plain-text and the equivalent cipher-text units. In
uniliteral as well as in monographic substitution, each plain-text unit consists of a single element
and each cipher-text unit consists of a single element. The two terms uniliteral and monographic
are therefore identical in significance, as defined cryptographically. It is when the
terms multiliteral and polygraphic are examined that an essential difference is seen. In
multiliteral substitution the plain-text unit always consists of a single element (one letter) and the
cipher-text unit consists of a group of two or more elements; when biliteral, it is a pair of elements,
when triliteral, it is a set of three elements, and so on. In what will herein be designated as
true or complete polygraphic substitution the plain-text unit consists of two or more elements
forming an indivisible compound; the cipher-text unit usually consists of a corresponding number
of elements.1 When the number of elements comprising the plain-text units is fixed and always
two, the system is digraphic; when it is three, the system is trigraphic; when it is four,
tetragraphic; and so on.2 It is important to note that in true or complete polygraphic substitution
the elements combine to form indivisible compounds having properties different from those of
either of the constituent letters. For example, in uniliteral substitution ABp may yield XYc and
ACp may yield XZc; but in true digraphic substitution ABp may yield XYc and ACp may yield QNc.
A difference in identity of one letter affects the whole result.3 An analogy is found in chemistry,
when two elements combine to form a molecule, the latter usually having properties quite
different from those of either of the constituent elements. For example: sodium, a metal, and
chlorine, a gas, combine to form sodium chloride, common table salt. Furthermore, sodium and
fluorine, also a gas similar in many respects to chlorine, combine to form sodium fluoride, which
is much different from table salt. Partial and pseudo-polygraphic substitution will be treated
under subparagraphs d and e below.

1 The qualifying adverb "usually" is employed because this correspondence is not essential. For example,
if one should draw up a set of 676 arbitrary single signs, it would be possible to represent the 2-letter pairs from
AA to ZZ by single symbols. This would still be a digraphic system.
2 In this sense a code system is merely a polygraphic substitution system in which the number of elements
constituting the plain-text units is variable.
3 For this reason the two letters are marked by a ligature, that is, by a bar across their tops.
d. Another way of looking at polygraphic substitution is to regard the elements comprising
the plain-text units as being enciphered individually and polyalphabetically by a fairly large
number of separate alphabets. For example, in a digraphic system in which 676 pairs of plain-text
letters are representable by 676 cipher-text pairs assigned at random, this is equivalent to
having a set of 26 different alphabets for enciphering one member of the pairs, and another set
of 26 different alphabets for enciphering the other member of the pairs. According to this
viewpoint the different alphabets are brought into play by the particular combination of letters
forming each plain-text pair. This is, of course, quite different from systems wherein the various
alphabets are brought into play by more definite rules; it is perhaps this very absence of definite
rules guiding the selection of alphabets which constitutes the cryptographic strength of this type
of polygraphic system.

e. When regarded in the light of the preceding remarks, certain systems which at first glance
seem to be polygraphic, in that groupings of plain-text letters are treated as units, on closer
inspection are seen to be only partially polygraphic, or pseudo-polygraphic in character. For
example, in a system in which encipherment is by pairs and yet one of the letters in each pair is
enciphered monoalphabetically, the other letter, polyalphabetically, the method is only
pseudo-polygraphic. Cases of this type are shown in Section VII of Advanced Military Cryptography.
Again, in a system in which encipherment is by pairs and the encipherments of the left-hand
and right-hand members of the pairs show group relationships, this is not pseudo-polygraphic
but only partially polygraphic. Cases of this type are also shown in the text referred to above.
f. The fundamental purpose of polygraphic substitution is again the suppression of the
frequency characteristics of plain text, just as is the case in monoalphabetic substitution with
variants; but here this is accomplished by a different method, the latter arising from a somewhat
different approach to the problem involved in producing cryptographic security. When the
substitution involves replacement of single letters in a monoalphabetic system, the cryptogram can
be solved rather readily. Basically the reason for this is that the principles of frequency and the
laws of probability, applied to individual units of the text (single letters), have a very good
opportunity to manifest themselves. A given volume of text of say n plain-text letters, enciphered
purely monoalphabetically, affords n cipher characters, and the same number of cipher units.
The same volume of text, enciphered digraphically, still affords n cipher characters but only
n/2 cipher units. Statistically speaking, the sample within which the laws of probability now apply
has been cut in half. Furthermore, from the point of view of frequency, the very noticeable
diversity in the frequencies of individual letters, leading to the marked crests and troughs of
the uniliteral frequency distribution, is no longer so strikingly in evidence in the frequencies of
digraphs. Therefore, although true digraphic encipherment, for example, cuts the cryptographic
textual units in half, the difficulty of solution is not doubled, but, if a matter of judgment arising
from practical experience can be expressed or approximated mathematically, squared or cubed.
g. Sections VII and VIII of Advanced Military Cryptography show various methods for the
derivation of polygraphic equivalents and for handling these equivalents in cryptographing and
decryptographing messages. The most practicable of those methods are digraphic in character
and for this reason their solution will be treated in a somewhat more detailed manner than will
trigraphic methods. The latter can be passed over with the simple statement that their analysis
requires much text to permit of solution by the frequency method, and hard labor. Fortunately,
they are infrequently encountered because they are difficult to manipulate without extensive
tables.4 If the latter are required they must be compiled in the form of a book or pamphlet. If
one is willing to go that far, one might as well include in such document more or less extensive lists
of words and phrases, in which case the system falls under the category of code and not cipher.
42. Tests for identifying digraphic substitution.-a. The tests which are applied to determine
whether a given cryptogram is digraphic in character are usually rather simple. If there
are many repetitions in the cryptogram and yet the uniliteral-frequency distribution gives no
clear-cut indications of monoalphabeticity; if most of the repetitions contain an even number
of letters; and if the cryptogram contains an even number of letters, it may be assumed to be
digraphic in nature.
b. The student should first try to determine whether the substitution is completely digraphic,
or only partially digraphic, or pseudo-digraphic in character. As mentioned above, there are
cases in which, although the substitution is effected by taking pairs of letters, one of the members
of the pairs is enciphered monoalphabetically, the other member, polyalphabetically. A
distribution based upon the letters in the odd positions and one based upon those in the even
positions should be made. If one of these is clearly monoalphabetic, then this is evidence that the
message represents a case of pseudo-digraphism of the type here described. By attacking the
monoalphabetic portion of the messages, solution can soon be reached by slight variation of the
usual method, the polyalphabetic portion being solved by the aid of the context and considerations
based upon the probable nature of the substitution chart. (See Tables 2, 3, and 4 of
Advanced Military Cryptography.) It will be noted that the charts referred to show definite
symmetry in their construction.
c. On the other hand, if the foregoing steps prove fruitless, it may be assumed that the
cryptogram is completely digraphic in character.
d. Just as certain statistical tests may be applied to a cryptogram to establish its
monoalphabeticity, so also may a statistical test be applied to a cryptogram for the purpose of
establishing its digraphicity. The nature of this test and its method of application will be discussed
in a subsequent text.
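
The two distributions called for in subparagraph b can be prepared with a few lines of Python. This is a minimal sketch in which the function name and the sample text are illustrative only. Judging whether either distribution is monoalphabetic is left to the usual tests.

```python
from collections import Counter


def positional_distributions(cipher_text):
    """Return separate frequency counts for the 1st and 2d members of the pairs."""
    letters = [c for c in cipher_text.upper() if c.isalpha()]
    first_members = Counter(letters[0::2])   # letters in the odd positions
    second_members = Counter(letters[1::2])  # letters in the even positions
    return first_members, second_members


odd, even = positional_distributions("HFCAP GOQIL BSPKM")
print(sorted(odd.items()))
print(sorted(even.items()))
```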


4 A patent has been granted upon a rather ingenious machine for automatically accomplishing true
polygraphic substitution, but it has not been placed upon the market. See U. S. Patent No. 1845947 issued in 1932
to Weisner and Hill. In U. S. Patent No. 1515680 issued to Henkels in 1924, there is described a mechanism
which also produces polygraphic substitution.

43. General procedure in the analysis of digraphic substitution ciphers.-a. The analysis of
cryptograms which have been produced by digraphic substitution is accomplished largely by
the application of the simple principles of frequency of digraphs, with the additional aid of such
special circumstances as may be known to or suspected by the cryptanalyst. The latter refer
to peculiarities which may be the result of the particular method employed in obtaining the
equivalents of the plain-text digraphs in the cryptographing process. In general, however,
only if there is sufficient text to disclose the normal phenomena of repetition will solution be
feasible or possible.
b. However, when a digraphic system is employed in regular service, there is little doubt
but that traffic will rapidly accumulate to an amount more than sufficient to permit of solution
by simple principles of frequency. Sometimes only two or three long messages, or a half dozen
of average length are sufficient. For with the identification of only a few cipher digraphs,
larger portions of messages may be read because the skeletons of words formed from the few
high-frequency digraphs very definitely limit the values that can be inserted for the intervening
unidentified digraphs. For example, suppose that the plain-text digraphs TH, ER, IN, IS, OF,
NT, and TO have been identified by frequency considerations, corroborated by a tentatively
identified long repetition; and suppose also that the enemy is known to be using a quadricular
table of 676 cells containing digraphs showing reciprocal equivalence between plain and
cipher-text digraphs. Suppose the message begins as follows (in which the assumed values have been
inserted):

XQ  VO  ZI  LK  AP  OL  ZX  PV  QN  IK  OL  UK  AL  HN  LK  VL
FO  ..  TH  IN  ..  NT  ..  RE  ..  ..  NT  NO  ..  ..  IN  ..

BN  OZ  KU  DY  EL  LE  YW
SI  ..  ON  TO  ..  ..  ..

The words FOURTH INFANTRY REGIMENT are readily recognized. The reciprocal pairs ELc
and LEc suggest ATTACK. The beginning of the message is now completely disclosed: FOURTH
INFANTRY REGIMENT NOT YET IN POSITION TO ATTACK. The values more or less automatically
determined are VOc=URp, ALc=TYp, HNc=ETp, VLc=POp, OZc=TIp, YWc=CKp.
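
The building of such a skeleton can be sketched in Python. Several points are assumptions made for the illustration. The cipher values given for ER, IS, and OF are obtained by reversing the pairs PV, BN, and XQ of the example. The value for ON is taken from the skeleton itself. Reciprocity is taken to mean that reversing a plain-text digraph reverses its cipher digraph, as the pairs ELc and LEc indicate.

```python
# Plain-text digraph -> cipher digraph, as assumed to have been identified.
IDENTIFIED = {
    "TH": "ZI", "ER": "VP", "IN": "LK", "IS": "NB",
    "OF": "QX", "NT": "OL", "TO": "DY", "ON": "KU",
}

MESSAGE = ("XQ VO ZI LK AP OL ZX PV QN IK OL UK AL HN LK VL "
           "BN OZ KU DY EL LE YW").split()


def with_reciprocals(identified):
    """Build a cipher -> plain table, adding the reversed pair for each entry."""
    table = {}
    for plain, cipher in identified.items():
        table[cipher] = plain
        table[cipher[::-1]] = plain[::-1]  # reversed cipher pair, reversed plain
    return table


def skeleton(message, identified):
    """Insert known values under the cipher digraphs; '..' marks unknown ones."""
    table = with_reciprocals(identified)
    return [table.get(pair, "..") for pair in message]


print("  ".join(MESSAGE))
print("  ".join(skeleton(MESSAGE, IDENTIFIED)))
```

Each value recovered from the context, such as VOc=URp, can be added to the table and the skeleton printed afresh, which is the way in which the solution grows.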


c. Once a good start has been made and a few words have been solved, subsequent work 
is quite simple and straightforward. A knowledge of enemy correspondence, including data 
regarding its most common words and phrases, is of great assistance in breaking down new 
digraphic tables of the same nature but with different equivalents. 
d. The foregoing remarks also apply to the details of solution in cases of partially 
digraphic substitution. 


44. Analysis of digraphic substitution ciphers based upon 4-square checkerboard designs.- 


a. In Section VIII of Advanced Military Oryptography there are shown various examples of di- 
graphic substitution based upon the use of checkerboard designs. These may be consider~d 
cases of partially digraphic substitution, in that in the checkerboard system there are certain 
relationships between plain-text digraphs having common elements and their correspon~g 
cipher-text digraphs, which will also have common elements. For example, take the followmg 
4-square checkerboard design: 


B w G 
R 
M 
0 
p 
A u L 


>---- - 
- 
- 
- 
>---- - 
- 
- 
,_ 
N 
y 
v x 
E 
H z 
Q 
D 
F 


>---- - 
- 
- 
,_ 


1 s 
I 
c 
T 
K 
K 
I 
T s c 3 


>---- - 
- 
- 
,_ - 
u 
p 
L 
A 
0 
M w R 
B 
G 
..__ - 
- 
D z 
F 
Q .H 
E 
y 
x 
N v 
- - - 
- - 
,_ 
w A 
L 
E s c x K 
p 
B 
- 
- 
- 
- 
- 
,_ 


F 
H u I 
T 
0 
M 
y 
D v 


,_______ - 
- 
- 
- 
,_ 


2 
p 
x 
B 
K c s 
A 
E w L 
4 


,_______ - 
- 
- 
- 
1- 


N z 
R 
Q 
G 
G z 
Q 
N 
R 


~ 
- 
- 
- 
- 
- 
D 
M v 
y 
0 
T 
H 
I 
F u 


Here BCp=OWc, BOp=OFc, BSp=OPc, BGp=ONc, and BTp=ODc. In each case, when Bp is the initial
letter of the plain-text pair, the initial letter of the cipher-text equivalent is Oc. This, of
course, is the direct result of the method; it means that the encipherment is monoalphabetic for
the first half of each of these five plain-text pairs, polyalphabetic for the second half. This
relationship holds true for four other groups of pairs beginning with Bp. In other words, there
are five alphabets employed, not 25. Thus, this case differs from the case discussed under
Par. 42b only in that the monoalphabeticity is not complete for one-half of all the pairs, but
only among the members of certain groups of pairs. In a completely digraphic system using a
676-cell randomized square, such relationships are entirely absent and for this reason the
system is cryptographically more secure than the checkerboard system.
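
The relationships described in subparagraph a can be verified mechanically. The following Python
sketch is illustrative only and forms no part of the original method: the section strings are
transcribed row by row from the 4-square design, on the assumption that Sections 1 and 2 hold the
plain-text letters and Sections 3 and 4 the cipher letters, and the name encipher_pair is
hypothetical.

```python
# Sections transcribed row by row from the 4-square design (25 letters, no J).
# Assumption: Sections 1 and 2 hold plain text, Sections 3 and 4 cipher text.
SEC1 = "BWGRM" "NYVXE" "SICTK" "UPLAO" "DZFQH"   # upper left
SEC3 = "OPAUL" "HZQDF" "KITSC" "MWRBG" "EYXNV"   # upper right
SEC4 = "WALES" "FHUIT" "PXBKC" "NZRQG" "DMVYO"   # lower left
SEC2 = "CXKPB" "OMYDV" "SAEWL" "GZQNR" "THIFU"   # lower right


def encipher_pair(pair: str) -> str:
    """Encipher one plain-text digraph with the 4-square design."""
    r1, c1 = divmod(SEC1.index(pair[0]), 5)   # first letter, Section 1
    r2, c2 = divmod(SEC2.index(pair[1]), 5)   # second letter, Section 2
    # First cipher letter: row of letter 1, column of letter 2 (Section 3).
    # Second cipher letter: row of letter 2, column of letter 1 (Section 4).
    return SEC3[5 * r1 + c2] + SEC4[5 * r2 + c1]


if __name__ == "__main__":
    for pair in ("BC", "BO", "BS", "BG", "BT"):
        print(pair, "->", encipher_pair(pair))
    # Initial cipher letters taken by the 25 pairs that begin with B.
    initials = {encipher_pair("B" + second)[0] for second in SEC2}
    print(sorted(initials))
```

Under these assumptions the 25 plain-text pairs beginning with B share only five initial cipher
letters, which is the partial monoalphabeticity the analyst exploits.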
b. From the foregoing, it is clear that when solution has progressed sufficiently to disclose
a few values, the insertion of letters within the cells of the checkerboard design to give the
plain-text and cipher relationships indicated by the solved values immediately leads to the
disclosure of additional values. Thus, the solution of only a few values soon leads to the
breakdown of the entire checkerboard design.


c. (1) The following example will serve to illustrate the procedure. Let the message be as
follows:

        1      6      11     16     21     26
    A.  HFCAP  GOQIL  BSPKM  NDUKE  OHQNF  BORUN
    B.  QCLCH  QBQBF  HMAFX  SIOKO  QYFNS  XMCGY
    C.  XIFBE  XAFDX  LPMXH  HRGKG  QKQML  FEQQI
    D.  GOIHM  UEORD  CLTUF  EQQCG  QNHFX  IFBEX
    E.  FLBUQ  FCHQO  QMAFT  XSYCB  EPFNB  SPKNU
    F.  QITXE  UQMLF  EQQIG  OIEUE  HPIAN  YTFLB
    G.  FEEPI  DHPCG  NQIHB  FHMHF  XCKUP  DGQPN
    H.  CBCQL  QPNFN  PNITO  RTENC  CBCNT  FHHAY
    I.  ZLQCI  AAIQU  CHTPC  BIFGW  KFCQS  LQMCB
    J.  OYCRQ  QDPRX  FNQML  FIDGC  CGIOG  OIHHF
    K.  IRCGG  GNDLN  OZTFG  EERRP  IFHOT  FHHAY
    L.  ZLQCI  AAIQU  CHTP


(2) The cipher having been tested for standard alphabets (by the method of completing
the normal components) and found to give negative results, a uniliteral-frequency distribution
is made. It is as follows:

      A   B   C   D   E   F   G   H   I   J   K   L   M
     11  15  26   8  16  30  17  23  23   0   8  14  11

      N   O   P   Q   R   S   T   U   V   W   X   Y   Z
     18  15  16  33   9   6  11  11   0   1  12   7   3

FIGURE 21.


(3) At first glance this may appear to the untrained eye to be a monoalphabetic frequency
distribution but upon closer inspection it is noted that aside from the frequencies of four or five
letters the frequencies for the remaining letters are not very dissimilar. There are, in reality, no
very marked crests and troughs, certainly not as many as would be expected in a monoalphabetic
substitution cipher of equal length.
(4) The message having been carefully examined for repetitions of 4 or more letters, all of
them are listed:

    Repetition                            Frequency    Located in lines
    TFHHAYZLQCIAAIQUCHTP (20 letters)         2        H and K.
    QMLFEQQIGOI (11 letters)                  2        C and F.
    XIFBEX (6 letters)                        2        C and D.
    FEQQ                                      3        C, D, F.
    QMLF                                      3        C, F, J.
    BFHM                                      2        B and G.
    BSPK                                      2        A and E.
    GOIH                                      2        D and J.

Since there are quite a few repetitions, two of considerable length, since all but one of them
contain an even number of letters, and since the message also contains an even number of letters,
344, digraphic substitution is suspected. The cryptogram is transcribed in 2-letter groups, for
greater convenience in study. It is as follows:


Message transcribed in pairs

         1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
    A.  HF CA PG OQ IL BS PK MN DU KE OH QN FB OR UN
    B.  QC LC HQ BQ BF HM AF XS IO KO QY FN SX MC GY
    C.  XI FB EX AF DX LP MX HH RG KG QK QM LF EQ QI
    D.  GO IH MU EO RD CL TU FE QQ CG QN HF XI FB EX
    E.  FL BU QF CH QO QM AF TX SY CB EP FN BS PK NU
    F.  QI TX EU QM LF EQ QI GO IE UE HP IA NY TF LB
    G.  FE EP ID HP CG NQ IH BF HM HF XC KU PD GQ PN
    H.  CB CQ LQ PN FN PN IT OR TE NC CB CN TF HH AY
    J.  ZL QC IA AI QU CH TP CB IF GW KF CQ SL QM CB
    K.  OY CR QQ DP RX FN QM LF ID GC CG IO GO IH HF
    L.  IR CG GG ND LN OZ TF GE ER RP IF HO TF HH AY
    M.  ZL QC IA AI QU CH TP


It is noted that all the repetitions listed above break up properly into digraphs except in
one case, viz, FEQQ in lines C, D, and F. This seems rather strange, and at first thought one
might suppose that a letter was dropped out or was added in the vicinity of the FEQQ in line D.
But it is immediately seen that the FE QQ in line D has no relation at all to the .F EQ Q. in
lines C and F, and that the FE QQ in line D is merely an accidental repetition.
(5) A digraphic frequency distribution¹ is made and is shown in Fig. 22. (First letters of
the digraphs are at the left, second letters at the top; a dot marks a blank cell.)

         A B C D E F G H I K L M N O P Q R S T U V W X Y Z
      A  . . . . . 3 . . 2 . . . . . . . . . . . . . . 2 .
      B  . . . . . 2 . . . . . . . . . 1 . 2 . 1 . . . . .
      C  1 5 . . . . 4 3 . . 1 . 1 . . 2 1 . . . . . . . .
      D  . . . . . . . . . . . . . . 1 . . . . 1 . . 1 . .
      E  . . . . . . . . . . . . . 1 2 2 1 . . 1 . . 2 . .
      F  . 3 . . 2 . . . . . 1 . 4 . . . . . . . . . . . .
      G  . . 1 . 1 . 1 . . . . . . 3 . 1 . . . . . 1 . 1 .
      H  . . . . . 4 . 3 . . . 2 . 1 2 1 . . . . . . . . .
      I  3 . . 2 1 2 . 3 . . 1 . . 2 . . 1 . 1 . . . . . .
      K  . . . . 1 1 1 . . . . . . 1 . . . . . 1 . . . . .
      L  . 1 1 . . 3 . . . . . . 1 . 1 1 . . . . . . . . .
      M  . . 1 . . . . . . . . . 1 . . . . . . 1 . . 1 . .
      N  . . 1 1 . . . . . . . . . . . 1 . . . 1 . . . 1 .
      O  . . . . . . . 1 . . . . . . . 1 2 . . . . . . 1 1
      P  . . . 1 . . 1 . . 2 . . 3 . . . . . . . . . . . .
      Q  . . 3 . . 1 . . 3 1 . 5 2 1 . 2 . . . 2 . . . 1 .
      R  . . . 1 . . 1 . . . . . . . 1 . . . . . . . 1 . .
      S  . . . . . . . . . . 1 . . . . . . . . . . . 1 1 .
      T  . . . . 1 4 . . . . . . . . 2 . . . . 1 . . 2 . .
      U  . . . . 1 . . . . . . . 1 . . . . . . . . . . . .
      V  . . . . . . . . . . . . . . . . . . . . . . . . .
      W  . . . . . . . . . . . . . . . . . . . . . . . . .
      X  . . 1 . . . . . 2 . . . . . . . . 1 . . . . . . .
      Y  . . . . . . . . . . . . . . . . . . . . . . . . .
      Z  . . . . . . . . . . 2 . . . . . . . . . . . . . .

FIGURE 22.


(6) The appearance of the foregoing distribution for this message is quite characteristic of
that for a digraphic substitution cipher. There are many blank cells; although there are many
cases in which a digraph appears only once, there are quite a few in which a digraph appears
two or three times, four cases in which a digraph appears four times, and two cases in which a
digraph appears five times. The absence of the letter J is also noted; this is often the case in a
digraphic system based upon a checkerboard design.
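
The tabulation of a digraphic distribution, as distinct from a biliteral one, can be sketched in
a few lines of Python. The sketch is illustrative only; the function names are hypothetical, and
the sample text in the demonstration is line H of the message.

```python
from collections import Counter


def digraphic_distribution(text: str) -> Counter:
    """Count successive, non-overlapping pairs: AB, CD, EF, ..."""
    return Counter(text[i:i + 2] for i in range(0, len(text) - 1, 2))


def biliteral_distribution(text: str) -> Counter:
    """Count every two successive letters: AB, BC, CD, ..."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def doubled_pairs(text: str) -> list:
    """Return the digraphs that consist of one letter repeated (e.g. HH)."""
    return sorted(p for p in digraphic_distribution(text) if p[0] == p[1])


if __name__ == "__main__":
    line_h = "CBCQLQPNFNPNITORTENCCBCNTFHHAY"   # line H of the message
    print(digraphic_distribution(line_h).most_common(3))
    print(doubled_pairs(line_h))
```

Applied to the whole cryptogram, the first function would yield the counts that make up a
digraphic frequency distribution; the third picks out doubled-letter digraphs.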


¹ The distinction between "digraphic" and "biliteral" is based upon the following consideration.
In a biliteral (or diliteral) distribution every two successive letters of the text would be
grouped together to form a pair. For example, a biliteral distribution of ABCDEF would tabulate
the pairs AB, BC, CD, DE, and EF. In a digraphic distribution only successive pairs of the text
are tabulated. For example, ABCDEF would yield only AB, CD, and EF.
(7) In another common type of checkerboard system known as the Playfair cipher, described
in Par. 46, one of the telltale indications besides the absence of the letter J is the absence of
double letters, that is, two successive identical letters. The occurrence of the double letters
GG, HH, and QQ in the message under investigation eliminates the possibility of its being a
Playfair cipher. The simplest thing to assume is that a 4-square checkerboard is involved. One
with normal alphabets in Sections 1 and 2 is therefore set down (Fig. 23a).


             A  B  C  D  E  |  .  .  .  .  .
             F  G  H I-J K  |  .  .  .  .  .
        1    L  M  N  O  P  |  .  .  .  .  .    3
             Q  R  S  T  U  |  .  .  .  .  .
             V  W  X  Y  Z  |  .  .  .  .  .
             ---------------+---------------
             .  .  .  .  .  |  A  B  C  D  E
             .  .  .  .  .  |  F  G  H I-J K
        4    .  .  .  .  .  |  L  M  N  O  P    2
             .  .  .  .  .  |  Q  R  S  T  U
             .  .  .  .  .  |  V  W  X  Y  Z

FIGURE 23a.


(8) The recurrence of the group QMLF, three times, and at intervals suggesting that it might
be a sentence separator, leads to the assumption that it is the word STOP. The letters Q, M, L,
and F are therefore inserted in the appropriate cells in Sections 3 and 4 of the diagram. Thus
(Fig. 23b):


             A  B  C  D  E  |  .  .  .  .  .
             F  G  H I-J K  |  .  .  .  .  .
        1    L  M  N  O  P  |  .  .  .  .  L    3
             Q  R  S  T  U  |  .  .  .  Q  .
             V  W  X  Y  Z  |  .  .  .  .  .
             ---------------+---------------
             .  .  .  .  .  |  A  B  C  D  E
             .  .  .  .  .  |  F  G  H I-J K
        4    .  .  .  F  .  |  L  M  N  O  P    2
             .  .  M  .  .  |  Q  R  S  T  U
             .  .  .  .  .  |  V  W  X  Y  Z

FIGURE 23b.
These placements seem logical. Moreover, in Section 3 the number of cells between L and
Q is just one less than enough to contain all the letters M to P, inclusive, and suggests that either
N or O is in the keyword portion of the sequence, that is, near the top of Section 3. Without
making a commitment in the matter, suppose both N and O, for the present, be inserted in the
cell between M and P. Thus (Fig. 23c):


             A  B  C  D  E  |  .  .  .  .  .
             F  G  H I-J K  |  .  .  .  .  .
        1    L  M  N  O  P  |  .  .  .  .  L    3
             Q  R  S  T  U  |  M N/O P  Q  .
             V  W  X  Y  Z  |  .  .  .  .  .
             ---------------+---------------
             .  .  .  .  .  |  A  B  C  D  E
             .  .  .  .  .  |  F  G  H I-J K
        4    .  .  .  F  .  |  L  M  N  O  P    2
             .  .  M  .  .  |  Q  R  S  T  U
             .  .  .  .  .  |  V  W  X  Y  Z

FIGURE 23c.


(9) Now, if the placement of P in Section 3 is correct, the cipher equivalent of THp will be
Pθc, and there should be a group of adequate frequency to correspond. Noting that PNc occurs
three times, it is assumed to be THp and the letter N is inserted in the appropriate cell in
Section 4. Thus (Fig. 23d):


             A  B  C  D  E  |  .  .  .  .  .
             F  G  H I-J K  |  .  .  .  .  .
        1    L  M  N  O  P  |  .  .  .  .  L    3
             Q  R  S  T  U  |  M N/O P  Q  .
             V  W  X  Y  Z  |  .  .  .  .  .
             ---------------+---------------
             .  .  .  .  .  |  A  B  C  D  E
             .  .  .  N  .  |  F  G  H I-J K
        4    .  .  .  F  .  |  L  M  N  O  P    2
             .  .  M  .  .  |  Q  R  S  T  U
             .  .  .  .  .  |  V  W  X  Y  Z

FIGURE 23d.
(10) It is about time to try out these assumed values in the message. The proper insertions
are made, with the following results:

         1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
    A.  HF CA PG OQ IL BS PK MN DU KE OH QN FB OR UN
    B.  QC LC HQ BQ BF HM AF XS IO KO QY FN SX MC GY
    C.  XI FB EX AF DX LP MX HH RG KG QK QM LF EQ QI
        .. .. .. .. .. .. .. .. .. .. .. ST OP .. ..
    D.  GO IH MU EO RD CL TU FE QQ CG QN HF XI FB EX
    E.  FL BU QF CH QO QM AF TX SY CB EP FN BS PK NU
        .. .. .. .. .. ST .. .. .. .. .. .. .. .. ..
    F.  QI TX EU QM LF EQ QI GO IE UE HP IA NY TF LB
        .. .. .. ST OP .. .. .. .. .. .. .. .. .. ..
    G.  FE EP ID HP CG NQ IH BF HM HF XC KU PD GQ PN
        .. .. .. .. .. .. .. .. .. .. .. .. .. .. TH
    H.  CB CQ LQ PN FN PN IT OR TE NC CB CN TF HH AY
        .. .. .. TH .. TH .. .. .. .. .. .. .. .. ..
    J.  ZL QC IA AI QU CH TP CB IF GW KF CQ SL QM CB
        .. .. .. .. .. .. .. .. .. .. .. .. .. ST ..
    K.  OY CR QQ DP RX FN QM LF ID GC CG IO GO IH HF
        .. .. .. .. .. .. ST OP .. .. .. .. .. .. ..
    L.  IR CG GG ND LN OZ TF GE ER RP IF HO TF HH AY
    M.  ZL QC IA AI QU CH TP


(11) So far no impossible combinations are in evidence. Beginning with group H4 in the
message is the following sequence:

    PN FN PN
    TH .. TH

Assume it to be THAT THE. Then ATp=FNc, and the letter N is to be inserted in row 4 column 1.
But this is inconsistent with previous assumptions, since N in Section 4 has already been
tentatively placed in row 2 column 4 of Section 4. Other assumptions for FNc are made: that it is
ISp (THIS TH...); that it is ENp (THEN TH...); but the same inconsistency is apparent. In fact
the student will see that FNc must represent a digraph ending in F, G, H, I-J, or K, since Nc is
tentatively located on the same line as these letters in Section 2. Now FNc occurs 4 times in
the message. The digraph it represents must be one of the following:

    DF, DG, DH, DI, DJ, DK
    IF, IG, IH, II, IJ, IK
    JF, JG, JH, JI, JJ, JK
    OF, OG, OH, OI, OJ, OK
    TF, TG, TH, TI, TJ, TK
    YF, YG, YH, YI, YJ, YK

Of these the only one likely to be repeated 4 times is OF, yielding

    TH OF TH
    PN FN PN

which may be a part of

    .N OR TH OF TH E.
    CQ LQ PN FN PN IT

or

    .S OU TH OF TH E.
    CQ LQ PN FN PN IT

In either case, the position of the F in Section 3 is excellent: F . . . L in row 3. There are 3
cells intervening between F and L, into which G, H, I-J, and K may be inserted. It is not nearly
so likely that G, H, and K are in the keyword as that I should be in it. Let it be assumed that
this is the case, and let the letters be placed in the appropriate cells in Section 3. Thus
(Fig. 23e):


             A  B  C  D  E  |  .  .  .  .  .
             F  G  H I-J K  |  .  .  .  .  .
        1    L  M  N  O  P  |  F  G  H  K  L    3
             Q  R  S  T  U  |  M N/O P  Q  .
             V  W  X  Y  Z  |  .  .  .  .  .
             ---------------+---------------
             .  .  .  .  .  |  A  B  C  D  E
             .  .  .  N  .  |  F  G  H I-J K
        4    .  .  .  F  .  |  L  M  N  O  P    2
             .  .  M  Q  .  |  Q  R  S  T  U
             .  .  .  .  .  |  V  W  X  Y  Z

FIGURE 23e.

Let the resultant derived values be checked against the frequency distribution. If the position
of H in Section 3 is correct, then the digraph ONp, normally of high frequency, should be
represented several times by HFc. Reference to Fig. 22 shows a frequency of 4 times. And HMc,
with 2 occurrences, represents NSp. There is no need to go through all the possible
corroborations.


(12) Going back to the assumption that

    TH .. TH
    PN FN PN

is part of the expression

    .N OR TH OF TH E.
    CQ LQ PN FN PN IT

or

    .S OU TH OF TH E.
    CQ LQ PN FN PN IT

it is seen at once from Fig. 23e that the latter is apparently correct and not the former, because
LQc equals OUp and not ORp. If θSp=CQc, this means that the letter C of the digraph CQc must be
placed in row 1 column 3 or row 2 column 3 of Section 3. Now the digraph CBc occurs 5 times,
CGc, 4 times, CHc, 3 times, CQc, 2 times. Let an attempt be made to deduce the exact position of
C in Section 3 and the positions of B, G, and H in Section 4. Since F is already placed in
Section 4, assume G and H directly follow it, and that B comes before it. How much before?
Suppose a trial be made. Thus (Fig. 23f):


[Fig. 23f. Trial checkerboard: Sections 1 and 2 normal; in Section 3, C tried in row 1 column 3
and in row 2 column 3; in Section 4, G and H placed directly after F, and B tried in several
cells before F.]


By referring now to the frequency distribution, Fig. 22, after a very few minutes of
experimentation it becomes apparent that the following is correct:


             A  B  C  D  E  |  .  .  C  .  .
             F  G  H I-J K  |  .  .  .  .  .
        1    L  M  N  O  P  |  F  G  H  K  L    3
             Q  R  S  T  U  |  M N/O P  Q  .
             V  W  X  Y  Z  |  .  .  .  .  .
             ---------------+---------------
             .  .  .  .  .  |  A  B  C  D  E
             .  .  .  N  .  |  F  G  H I-J K
        4    B  .  .  F  G  |  L  M  N  O  P    2
             H  .  M  Q  .  |  Q  R  S  T  U
             .  .  .  .  .  |  V  W  X  Y  Z

FIGURE 23g.
(13) The identifications given by these placements are inserted in the text, and solution
is very rapidly completed. The final checkerboard and deciphered text are given below.


             A  B  C  D  E  |  S  O  C  I  E
             F  G  H I-J K  |  T  Y  A  B  D
        1    L  M  N  O  P  |  F  G  H  K  L    3
             Q  R  S  T  U  |  M  N  P  Q  R
             V  W  X  Y  Z  |  U  V  W  X  Z
             ---------------+---------------
             E  X  P  U  L  |  A  B  C  D  E
             S  I  O  N  A  |  F  G  H I-J K
        4    B  C  D  F  G  |  L  M  N  O  P    2
             H  K  M  Q  R  |  Q  R  S  T  U
             T  V  W  Y  Z  |  V  W  X  Y  Z

FIGURE 23h.
    A.  HFCAP  GOQIL  BSPKM  NDUKE  OHQNF  BORUN
        ONEHU  NDRED  FIRST  FIELD  ARTIL  LERYF

    B.  QCLCH  QBQBF  HMAFX  SIOKO  QYFNS  XMCGY
        ROMPO  SITIO  NSINV  ICINI  TYOFB  ARLOW

    C.  XIFBE  XAFDX  LPMXH  HRGKG  QKQML  FEQQI
        WILLB  EINGE  NERAL  SUPPO  RTSTO  PDURI

    D.  GOIHM  UEORD  CLTUF  EQQCG  QNHFX  IFBEX
        NGATT  ACKSP  ECIAL  ATTEN  TIONW  ILLBE

    E.  FLBUQ  FCHQO  QMAFT  XSYCB  EPFNB  SPKNU
        PAIDT  OASSI  STING  ADVAN  CEOFF  IRSTB

    F.  QITXE  UQMLF  EQQIG  OIEUE  HPIAN  YTFLB
        RIGAD  ESTOP  DURIN  GADVA  NCEIT  WILLP

    G.  FEEPI  DHPCG  NQIHB  FHMHF  XCKUP  DGQPN
        LACEC  ONCEN  TRATI  ONSON  WOODS  NORTH

    H.  CBCQL  QPNFN  PNITO  RTENC  CBCNT  FHHAY
        ANDSO  UTHOF  THAYE  RFARM  ANDHI  LLSIX

    J.  ZLQCI  AAIQU  CHTPC  BIFGW  KFCQS  LQMCB
        ZEROE  IGHTD  ASHAA  NDONW  OODSE  ASTAN

    K.  OYCRQ  QDPRX  FNQML  FIDGC  CGIOG  OIHHF
        DWEST  THERE  OFSTO  PCOMM  ENCIN  GATON

    L.  IRCGG  GNDLN  OZTFG  EERRP  IFHOT  FHHAY
        ETENP  MSMOK  EWILL  BEUSE  DONHI  LLSIX

    M.  ZLQCI  AAIQU  CHTP
        ZEROE  IGHTD  ASHA
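
As a check on the recovered checkerboard, the decipherment can be reproduced mechanically. The
following Python sketch is illustrative only: it assumes that Sections 1 and 2 are normal
alphabets, that Section 3 is the keyword-mixed alphabet on SOCIETY and Section 4 the
keyword-mixed alphabet on EXPULSION, as the final checkerboard indicates; the function names
are hypothetical.

```python
NORMAL = "ABCDEFGHIKLMNOPQRSTUVWXYZ"   # Sections 1 and 2 (I and J share a cell)
SEC3 = "SOCIETYABDFGHKLMNPQRUVWXZ"     # Section 3, keyword SOCIETY
SEC4 = "EXPULSIONABCDFGHKMQRTVWYZ"     # Section 4, keyword EXPULSION


def decipher_pair(pair: str) -> str:
    """Decipher one cipher digraph with the recovered 4-square."""
    r3, c3 = divmod(SEC3.index(pair[0]), 5)   # first cipher letter, Section 3
    r4, c4 = divmod(SEC4.index(pair[1]), 5)   # second cipher letter, Section 4
    # First plain letter: row of cipher letter 1, column of cipher letter 2.
    # Second plain letter: row of cipher letter 2, column of cipher letter 1.
    return NORMAL[5 * r3 + c4] + NORMAL[5 * r4 + c3]


def decipher(cipher: str) -> str:
    """Decipher a cryptogram of even length, pair by pair."""
    return "".join(decipher_pair(cipher[i:i + 2])
                   for i in range(0, len(cipher), 2))


if __name__ == "__main__":
    print(decipher("HFCAPGOQIL"))        # opening groups of line A
    print(decipher("ZLQCIAAIQUCHTP"))    # line M
```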


d. (1) It is interesting to note how much simpler the matter becomes when the positions
of the plain-text and cipher-text sections are reversed, or, what amounts to the same thing,
when in encipherment the plain-text pairs are sought in the sections containing the mixed
alphabets, and their cipher equivalents are taken from the sections containing the normal
alphabets. For example, referring to Fig. 23h, suppose that Sections 3-4 be used as the source of
the plain-text pairs, and Sections 1-2 as the source of the cipher-text pairs. Then ONp=DGc,
EHp=AUc, etc.
(2) To solve a message enciphered in that manner, it is necessary merely to make a square
in which all four sections are normal alphabets, and then perform two steps. First, the
cipher-text pairs are converted into their normal-alphabet equivalents merely by "deciphering"
the message with that square; the result of this operation yields two monoalphabets, one composed
of the odd letters, the other of the even letters. The second step is to solve these two
monoalphabets.
(3) Where the same mixed alphabet is inserted in Sections 3 and 4, the problem is still
easier, since the letters resulting from the conversion into normal-alphabet equivalents all
belong to the same, single mixed alphabet.
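
The first of the two steps can be illustrated in Python. The sketch below is illustrative only:
it assumes the mixed alphabets on SOCIETY and EXPULSION for Sections 3 and 4, takes the plain
text from those sections and the cipher text from normal Sections 1 and 2, and uses hypothetical
function names. It shows that conversion with an all-normal square reduces the odd letters to
one monoalphabet and the even letters to another.

```python
NORMAL = "ABCDEFGHIKLMNOPQRSTUVWXYZ"   # normal alphabet, I and J sharing a cell
SEC3 = "SOCIETYABDFGHKLMNPQRUVWXZ"     # mixed alphabet for first plain letters
SEC4 = "EXPULSIONABCDFGHKMQRTVWYZ"     # mixed alphabet for second plain letters


def encipher_reversed(plain: str) -> str:
    """Plain pairs sought in Sections 3-4, cipher pairs taken from 1-2."""
    out = []
    for i in range(0, len(plain), 2):
        ra, ca = divmod(SEC3.index(plain[i]), 5)
        rb, cb = divmod(SEC4.index(plain[i + 1]), 5)
        out.append(NORMAL[5 * ra + cb] + NORMAL[5 * rb + ca])
    return "".join(out)


def convert_with_normal_square(cipher: str) -> str:
    """Step 1: 'decipher' with a square whose four sections are all normal."""
    out = []
    for i in range(0, len(cipher), 2):
        r1, c1 = divmod(NORMAL.index(cipher[i]), 5)
        r2, c2 = divmod(NORMAL.index(cipher[i + 1]), 5)
        out.append(NORMAL[5 * r1 + c2] + NORMAL[5 * r2 + c1])
    return "".join(out)


if __name__ == "__main__":
    plain = "ATTACKATDAWN"               # arbitrary sample text, even length, no J
    converted = convert_with_normal_square(encipher_reversed(plain))
    print(converted[0::2])               # odd letters: one monoalphabet
    print(converted[1::2])               # even letters: a second monoalphabet
```

Each converted odd letter depends only on the corresponding plain letter and on Section 3, and
each even letter only on Section 4, so ordinary monoalphabetic methods apply to each half.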


45. Analysis of ciphers based upon other types of checkerboard designs.-The solution
of cryptograms enciphered by other types of checkerboard designs is accomplished along lines
very similar to those set forth in the foregoing example of the solution of a message prepared by
means of a 4-square checkerboard design. There are, unfortunately, no means or tests which can
be applied to determine in the early stages of the analysis exactly what type of design is involved
in the first case under study. The author freely admits that the solution outlined in subparagraph
c is quite artificial in that nothing is demonstrated in step (7) that obviously leads to or warrants
the assumption that a 4-square checkerboard is involved. This point was passed over with the
quite bald statement that this was "the simplest thing to assume"--and then the solution
proceeds exactly as though this mere hypothesis has been definitely established. For example, the
very first results obtained were based upon assuming that a certain 4-letter repetition represented
the word STOP and immediately inserting certain letters in appropriate cells in a 4-square
checkerboard. Several more assumptions were built on top of that and very rapid strides were made.
What if it had not been a 4-square checkerboard at all? What if it had been a 2-square
checkerboard of the type shown in Fig. 24?


        M  A  N  U  F  |  O  S  Q  L  P
        C  T  R  I  G  |  W  Z  Y  V  X
        B  D  E  H  K  |  D  K  H  B  E
        L  O  P  Q  S  |  A  F  U  M  N
        V  W  X  Y  Z  |  T  G  I  C  R

FIGURE 24.


The only defense that can be made of what may seem to the student to be purely arbitrary
procedure based upon the author's advance information or knowledge is the following: In the
first place, in order to avoid making the explanation a too-long-drawn-out affair, it is necessary
(and pedagogical experience warrants) that certain alternative hypotheses be passed over in
silence. In the second place, it may now be added, after the principles and procedure have been
elucidated (which at this stage is the primary object of this text) that if good results do not follow
from a first hypothesis, the only thing the cryptanalyst can do is to reject that hypothesis, and
formulate a second hypothesis. In actual practice he may have to reject a second, third, fourth,
. . . nth hypothesis. In the end he may strike the right one, or he may not. There is no
guaranty of success in the matter. In the third place, one of the objects of this text is to show
how certain systems, if employed for military purposes, can readily be broken down. Assuming
that a checkerboard system is in use, and that daily changes in keywords are made, it is possible
that the traffic of the first day might give considerable difficulty in solution, if the type of
checkerboard were not known to the cryptanalyst. But the second or third day's traffic would
be easy to solve, because by that time the cryptanalytic personnel would have analyzed the
system and thus learned what type of checkerboard the enemy is using.


46. Analysis of the Playfair cipher system.-a. An excellent example of a practical, partially
digraphic system is the Playfair cipher.⁶ It was used for a number of years as a field cipher by
the British Army, before and during the World War, and for a short time, also during that
war, by certain units of the American Expeditionary Forces.
b. Published solutions⁷ for this cipher are quite similar basically and vary only in minor
details. The earliest, that by Lieut. Mauborgne, used straightforward principles of frequency to
establish the values of three or four of the most frequent digraphs. Then, on the assumption
that in most cases in which a keyword appears on the first and second rows the last five letters
of the normal alphabet, VWXYZ, will rarely be disturbed in sequence and will occupy the last row
of the square, he "juggles" the letters given by the values tentatively established from frequency
considerations, placing them in various positions in the square, together with VWXYZ, to correspond
to the plain-text cipher relationships tentatively established. A later solution by Lieut. Frank
Moorman, as described in Hitt's Manual, assumes that in a Playfair cipher prepared by means
of a square in which the keyword occupies the first and second rows, if a digraphic frequency
distribution is made, it will be found that the letters having the greatest combining power are
very probably letters of the key. A still later solution, by Lieut. Commander Smith, is perhaps
the most lucid and systematized of the three. He sets forth in definite language certain
considerations which the other two writers certainly entertained but failed to indicate.


c. The following details have been summarized from Commander Smith's solution:
(1) The Playfair cipher may be recognized by virtue of the fact that it always contains an
even number of letters, and that when divided into groups of two letters each, no group contains
a repetition of the same letter, as NN or EE. Repetitions of digraphs, trigraphs, and polygraphs
will be evident in fairly long messages.
(2) Using the square⁸ shown in Fig. 25a, there are two general cases to be considered, as
regards the results of encipherment:

         B    A    N    K    R
         D    E    F    G    H
        I-J   L    M    O    Q
         U    P    T    C    Y
         S    V    W    X    Z

FIGURE 25a.


⁶ This cipher was really invented by Sir Charles Wheatstone but receives its name from Lord
Playfair, who apparently was its sponsor before the British Foreign Office. See Wemyss Reid,
Memoirs of Lyon Playfair, London, 1899. A detailed description of this cipher will be found in
Sec. VIII, Advanced Military Cryptography.

⁷ Mauborgne, Lieut. J. O., U. S. A. An advanced problem in cryptography and its solution,
Leavenworth, 1914.
Hitt, Captain Parker, U. S. A. Manual for the solution of military ciphers, Leavenworth, 1918.
Smith, Lieut. Commander W. W., U. S. N. In Cryptography by Andre Langie, translated by J. C. H.
Macbeth, New York, 1922.

⁸ The Playfair square accompanying Commander Smith's solution is based upon the keyword BANKRUPTCY,
"to be distributed between the first and fourth lines of the square." This is a simple departure
from the original Playfair scheme in which the letters of the keyword are written from left to
right and in consecutive lines from the top downward.
CASE 1. Letters at opposite corners of a rectangle. The following illustrative relationships
are found:

    THp=YFc
    HTp=FYc
    YFp=THc
    FYp=HTc

Reciprocity is complete.
CASE 2. Two letters in the same line or column. The following illustrative relationships
are found:

    ANp=NKc
    NAp=KNc

But NKp does not=ANc, nor does KNp=NAc.
Reciprocity is only partial.
(3) The foregoing gives rise to the following:
RULE I. (a) Regardless of the position of the letters in the square, if

    1.2=3.4, then
    2.1=4.3

(b) If 1 and 2 form opposite corners of a rectangle, the following equations obtain:

    1.2=3.4
    2.1=4.3
    3.4=1.2
    4.3=2.1
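
The two cases and Rule I can be checked with a short Python sketch. It is illustrative only:
the square is that of Fig. 25a read row by row, with I standing for the cell I-J, and the
function name and its step parameter are hypothetical.

```python
SQUARE = "BANKR" "DEFGH" "ILMOQ" "UPTCY" "SVWXZ"   # Fig. 25a, I standing for I-J


def playfair_pair(pair: str, step: int = 1) -> str:
    """Encipher (step=1) or decipher (step=-1) one digraph of two different letters."""
    r1, c1 = divmod(SQUARE.index(pair[0]), 5)
    r2, c2 = divmod(SQUARE.index(pair[1]), 5)
    if r1 == r2:        # Case 2, same row: letters to the right
        c1, c2 = (c1 + step) % 5, (c2 + step) % 5
    elif c1 == c2:      # Case 2, same column: letters below
        r1, r2 = (r1 + step) % 5, (r2 + step) % 5
    else:               # Case 1, rectangle: exchange columns
        c1, c2 = c2, c1
    return SQUARE[5 * r1 + c1] + SQUARE[5 * r2 + c2]


if __name__ == "__main__":
    for pair in ("TH", "HT", "YF", "FY", "AN", "NA", "NK"):
        print(pair, "->", playfair_pair(pair))
```

In Case 1 the same operation both enciphers and deciphers, which is why reciprocity is
complete; in Case 2 decipherment must step in the opposite direction.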


(4) A letter considered as occupying a position in a row can be combined with but four other
letters in the same row; the same letter considered as occupying a position in a column can be
combined with but four other letters in the same column. Thus, this letter can be combined with
only 8 other letters all told, under Case 2, above. But the same letter considered as occupying
a corner of a rectangle can be combined with 16 other letters, under Case 1, above. Commander
Smith derives from these facts the conclusion that "it would appear that Case 1 is twice as
probable as Case 2." He continues thus (notation my own):

"Now in the square, note that:

    ANp=NKc        ENp=FAc
    GNp=FKc        EMp=FLc
    ONp=MKc  also  ETp=FPc
    CNp=TKc        EWp=FVc
    XNp=WKc        EFp=FGc

"From this it is seen that of the 24 equations that can be formed when each letter of the
square is employed either as the initial or final letter of the group, five will indicate a
repetition of a corresponding letter of plain text.

"Hence, RULE II. After it has been determined, in the equation 1.2=3.4, that, say, ENp=FAc,
there is a probability of one in five that any other group beginning with Fc indicates Eθp, and
that any group ending in Ac indicates θNp.
"After such combinations as ERp, ORp, and ENp have been assumed or determined, the above
rule may be of use in discovering additional digraphs and partial words."⁹
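
The estimate in Rule II can be examined by direct enumeration. The Python sketch below is
illustrative only: it assumes the square of Fig. 25a and the usual Playfair convention that
decipherment takes the letter to the left in a row and the letter above in a column; the
function names are hypothetical. It tabulates, for the 24 cipher digraphs having a given initial
letter, how often each plain-text initial letter results.

```python
from collections import Counter

SQUARE = "BANKR" "DEFGH" "ILMOQ" "UPTCY" "SVWXZ"   # Fig. 25a, I standing for I-J


def decipher_pair(pair: str) -> str:
    """Decipher one Playfair digraph of two different letters."""
    r1, c1 = divmod(SQUARE.index(pair[0]), 5)
    r2, c2 = divmod(SQUARE.index(pair[1]), 5)
    if r1 == r2:        # same row: letters to the left
        c1, c2 = (c1 - 1) % 5, (c2 - 1) % 5
    elif c1 == c2:      # same column: letters above
        r1, r2 = (r1 - 1) % 5, (r2 - 1) % 5
    else:               # rectangle: exchange columns
        c1, c2 = c2, c1
    return SQUARE[5 * r1 + c1] + SQUARE[5 * r2 + c2]


def initial_plain_counts(cipher_initial: str) -> Counter:
    """Count plain-text initials over the 24 digraphs with this cipher initial."""
    counts = Counter()
    for second in SQUARE:
        if second != cipher_initial:
            counts[decipher_pair(cipher_initial + second)[0]] += 1
    return counts


if __name__ == "__main__":
    print(initial_plain_counts("F"))
```

For the initial letter F the tabulation is not uniform: one plain-text initial accounts for 8 of
the 24 digraphs and four others for 4 each, so "one in five" holds only as an approximation.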


RULE III. In the equation 1.2=3.4, 1 and 3 can never be identical, nor can 2 and 4 ever be
identical. Thus ANp could not possibly be represented by AYc, nor could ERp be represented by KRc.
This rule is useful in elimination of certain possibilities when a specific message is being studied.
RULE IV. In the equation 1.2p=3.4c, if 2 and 3 are identical, the letters are all in the same
row or column, and in the relative order 124. In the square shown, ANp=NKc and the absolute
order is ANK.. The relative order 124 includes five absolute orders which are cyclic permutations
of one another. Thus: ANK.., NK..A, K..AN, ..ANK, and .ANK.
RULE V. In the equation 1.2p=3.4c, if 1 and 4 are identical, the letters are all in the same
row or column, and in the relative order 243. In the square shown, KNp=RKc and the absolute
order is NKR. The relative order 243 includes five absolute orders which are cyclic permutations
of one another. Thus: NKR.., KR..N, R..NK, ..NKR, and .NKR.

RULE VI. "Analyze the message for group recurrences. Select the groups of greatest
recurrence and assume them to be high-frequency digraphs. Substitute the assumed digraphs
throughout the message, testing the assumptions in their relation to other groups of the cipher.
The reconstruction of the square proceeds simultaneously with the solution of the message and
aids in hastening the translation of the cipher."


d. (1) When solutions for the Playfair cipher system were first developed, based upon the
fact that the letters were inserted in the cells in keyword-mixed order, cryptographers thought
it desirable to place stumbling blocks in the path of such solution by departing from strict,
keyword-mixed order. Playfair squares of the latter type are designated as "modified Playfair
squares." One of the simplest methods is illustrated in Fig. 25a, wherein it will be noted that
the last five letters of the keyword proper are inserted in the fourth row of the square instead
of the second, where they would naturally fall. Another method is to insert the letters within
the cells from left to right and top downward but use a sequence that is a keyword-mixed sequence
developed by a columnar transposition based upon the keyword proper. Thus, using the keyword
BANKRUPTCY:

    2  1  5  4  7  9  6  8  3  10
    B  A  N  K  R  U  P  T  C  Y
    D  E  F  G  H  I  L  M  O  Q
    S  V  W  X  Z

    Sequence: A E V B D S C O K G X N F W P L R H Z T M U I Y Q


⁹ There is an error in this reasoning. Take, for example, the 24 equations having F as an
initial letter:

    Case            Case            Case            Case
    1. FBc=DNp      2. FE=ED        2. FT=NM        1. FX=GW
    2. FD =EH       1. FL=EM        2. FW=NT        1. FR=HN
    1. FI =DM       1. FP=ET        1. FK=GN        2. FH=EG
    1. FU =DT       1. FV=EW        2. FG=EF        1. FQ=HM
    1. FS =DW       2. FN=NW        1. FO=GM        1. FY=HT
    1. FA =EN       2. FM=NF        1. FC=GT        1. FZ=HW

Here, the initial letter Fc represents the following initial letters of plain-text digraphs:
D, E, N, G, H. It is seen that Fc represents Dp, Np, Gp, Hp 4 times each, and Ep, 8 times.
Consequently, supposing that it has been determined that FAc=ENp, the probability that Fc will
represent Ep is not 1 in 5 but 8 in 24, or 1 in 3; but supposing that it has been determined
that FWc=NTp, the probability that Fc will represent Np is 4 in 24 or 1 in 6. The difference in
these probabilities is occasioned by the fact that the first instance, FAc=ENp, corresponds to a
Case 1 encipherment, the second instance, FWc=NTp, to a Case 2 encipherment. But there is no way
of knowing initially, and without other data, whether one is dealing with a Case 1 or Case 2
encipherment. Only as an approximation, therefore, may one say that the probability of Fc
representing a given θp is 1 in 5.


The Playfair square is as follows:

        A  E  V  B  D
        S  C  O  K  G
        X  N  F  W  P
        L  R  H  Z  T
        M  U  I  Y  Q

FIGURE 25b.
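
The development of a keyword-mixed sequence by columnar transposition can be sketched in
Python. The sketch is illustrative only: it assumes a 25-letter alphabet with J merged into I
and a keyword containing no J, and the function name is hypothetical.

```python
def keyword_mixed_sequence(keyword: str) -> str:
    """Keyword-mixed sequence developed by columnar transposition on the keyword."""
    alphabet = "ABCDEFGHIKLMNOPQRSTUVWXYZ"      # 25 letters, J merged with I
    key = "".join(dict.fromkeys(keyword))       # drop repeated keyword letters
    rest = "".join(ch for ch in alphabet if ch not in key)
    stream = key + rest
    width = len(key)
    # Write the keyword, then the remaining letters beneath it in rows.
    rows = [stream[i:i + width] for i in range(0, len(stream), width)]
    # Numerical key: take the columns in alphabetical order of their keyword letter.
    order = sorted(range(width), key=lambda c: key[c])
    return "".join(row[c] for c in order for row in rows if c < len(row))


if __name__ == "__main__":
    sequence = keyword_mixed_sequence("BANKRUPTCY")
    for i in range(0, 25, 5):                   # the sequence written into a 5 x 5 square
        print(" ".join(sequence[i:i + 5]))
```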


(2) In the foregoing square practically all indications that the square has been developed
from a keyword have disappeared. The principal disadvantage of such an arrangement is that
it requires more time to locate the letters desired, both in cryptographing and decryptographing,
than it usually does when a semblance of normal alphabetic order is preserved in the square.
(3) Note the following three squares:


    Z  T  L  R  H        O  K  G  S  C        N  F  W  P  X
    Y  Q  M  U  I        H  Z  T  L  R        R  H  Z  T  L
    B  D  A  E  V        V  B  D  A  E        U  I  Y  Q  M
    K  G  S  C  O        F  W  P  X  N        E  V  B  D  A
    W  P  X  N  F        I  Y  Q  M  U        C  O  K  G  S

    FIGURE 25c.          FIGURE 25d.          FIGURE 25e.
At first glance they all appear to be different, but closer examination shows them to be cyclic
permutations of one another and of the square in Fig. 25b. They yield identical equivalents in
all cases. However, if an attempt be made to reconstruct the original keyword, it would be
much easier to do so from Fig. 25b than from any of the others, because in Fig. 25b the
keyword-mixed sequence has not been disturbed as much as in Figs. 25c, d, e. In working with Playfair
ciphers, the student should be on the lookout for such instances of cyclic permutation of the
original Playfair square, for during the course of solution he will not know whether he is building
up the original or an equivalent cyclic permutation of the original square; only after he has
completely reconstructed the square will he be able to determine this point.
(4) It can readily be shown that the columns of a Playfair square may be cyclically permuted 
(see subpar. d) to produce a first set of 25 squares all of which, though at first glance apparently 
different, will yield identical equivalents; likewise, the rows of such a square may be cyclically 
permuted to produce a second set of 25 squares all of which will also yield identical equivalents.
Thus there may be a total of 50 cyclic permutations composed of two sets of 25 each. The
cipher equivalents yielded by Case 2 encipherments (letters in the same row or in the same
column) will be identical for any two of these 50 different Playfair squares; but the cipher
equivalents yielded by Case 1 encipherments (letters at diagonally opposite corners of a rectangle)
will be identical only for two squares belonging to the same set of 25 cyclic permutations.
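
The equivalence of cyclically permuted squares can be verified by brute force over all 600 digraphs. The sketch below uses the square of Fig. 25b. The first check shifts rows and columns cyclically. The second check rests on an assumption: it takes the other set of 25 to be the same squares with rows and columns interchanged, a reading suggested by the remark later in this paragraph that the reverse orientation would give XC in place of CX. All function names are illustrative.

```python
from itertools import permutations

SQUARE = ["AEVBD", "SCOKG", "XNFWP", "LRHZT", "MUIYQ"]  # square of Fig. 25b

def positions(square):
    return {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}

def encipher_pair(square, a, b):
    pos = positions(square)
    (r1, c1), (r2, c2) = pos[a], pos[b]
    if r1 == r2:   # Case 2, same row: letter to the right
        return square[r1][(c1 + 1) % 5] + square[r2][(c2 + 1) % 5]
    if c1 == c2:   # Case 2, same column: letter below
        return square[(r1 + 1) % 5][c1] + square[(r2 + 1) % 5][c2]
    return square[r1][c2] + square[r2][c1]  # Case 1: rectangle corners

def shifted(square, down, across):
    """Cyclically permute the columns by `across` and the rows by `down`."""
    rows = [row[across:] + row[:across] for row in square]
    return rows[down:] + rows[:down]

def equivalents(square):
    letters = "".join(square)
    return {a + b: encipher_pair(square, a, b) for a, b in permutations(letters, 2)}

base = equivalents(SQUARE)
all_shifts_agree = all(equivalents(shifted(SQUARE, d, a)) == base
                       for d in range(5) for a in range(5))

# Assumption: the second set is obtained by interchanging rows and columns.
transposed = ["".join(col) for col in zip(*SQUARE)]
other = equivalents(transposed)
pos = positions(SQUARE)

def is_case_two(pair):
    (r1, c1), (r2, c2) = pos[pair[0]], pos[pair[1]]
    return r1 == r2 or c1 == c2

case_two_agree = all(other[p] == base[p] for p in base if is_case_two(p))
case_one_reversed = all(other[p] == base[p][::-1] for p in base if not is_case_two(p))
print(all_shifts_agree, case_two_agree, case_one_reversed)
```

Under that assumption the Case 2 equivalents agree across both sets, while every Case 1 equivalent comes out with its two letters reversed.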
e. (1) The steps in the solution of a typical example of this cipher may be useful. Let
the message be as follows:

          1     6     11    16    21    26
    A.    VTQEU HIOFT CHXSC AKTVT RAZEV TAGAE
    B.    OXTYM HCRLZ ZTQTD UMCYC XCTGM TYCZU
    C.    SNOPD GXVXS CAKTV TPKPU TZPTW ZFNBG
    D.    PTRKX IXBPR ZOEPU TOLZE KTTCS NHCQM
    E.    VTRKM WCFZU BHTVY ABGIP RZKPC QFNLV
    F.    OXOTU ZFACX XCPZX HCYNO TYOLG XXIIH
    G.    TMSMX CPTOT CXOTT CYATE XHFAC XXCPZ
    H.    XHYCT XWLZT SGPZT VYWCE TWGCC MBHMQ
    J.    YXZPW GRTIV UXPUM QRKMW CXTMR SWGHB
    K.    XCPTO TCXOT MIPYD NFGKI TCOLX UETPX
    L.    XFSRS UZTDB HOZIG XRKIX ZPPVZ IDUHQ
    M.    OTKTK CCHXX

(2) Without going through the preliminary tests in detail, with which it will be assumed
that the student is now familiar,10 the conclusion is reached that the cryptogram is digraphic in
nature, and a digraphic frequency distribution is made (Fig. 26).

[FIGURE 26. Digraphic frequency distribution of the cryptogram, a 25 x 25 tabulation with the
letters A to Z (J omitted) across the top and down the side. The individual entries are not
legible here.]

10 See Par. 44c.

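
A digraphic frequency distribution of this kind can be compiled mechanically. The sketch below is an illustration only: the cryptogram letters are as read here from the printed message (a few doubtful letters were settled from repetitions in the text and should be treated as assumptions), and the text is cut into successive non-overlapping pairs.

```python
from collections import Counter

# Assumption: cryptogram as read from the printed message, lines A to M.
CRYPTOGRAM = """
VTQEU HIOFT CHXSC AKTVT RAZEV TAGAE
OXTYM HCRLZ ZTQTD UMCYC XCTGM TYCZU
SNOPD GXVXS CAKTV TPKPU TZPTW ZFNBG
PTRKX IXBPR ZOEPU TOLZE KTTCS NHCQM
VTRKM WCFZU BHTVY ABGIP RZKPC QFNLV
OXOTU ZFACX XCPZX HCYNO TYOLG XXIIH
TMSMX CPTOT CXOTT CYATE XHFAC XXCPZ
XHYCT XWLZT SGPZT VYWCE TWGCC MBHMQ
YXZPW GRTIV UXPUM QRKMW CXTMR SWGHB
XCPTO TCXOT MIPYD NFGKI TCOLX UETPX
XFSRS UZTDB HOZIG XRKIX ZPPVZ IDUHQ
OTKTK CCHXX
"""

def digraph_counts(text):
    """Count successive, non-overlapping pairs of letters."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(a + b for a, b in zip(letters[0::2], letters[1::2]))

counts = digraph_counts(CRYPTOGRAM)
for pair, n in counts.most_common(8):
    print(pair, n)
```

The same counter answers questions about reversed digraphs directly, since a missing pair simply counts as zero.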
(7) It is perhaps high time that the whole list of tentative equivalent values be studied in
relation to their consistency with the positions of letters in the Playfair square; moreover, by so
doing, additional values may be obtained in the process. The complete list of values is as follows:

    Assumed values        Derived by Rule I
      ATp=CXc               TAp=XCc
      LIp=PZc               ILp=ZPc
      ONp=XHc               NOp=HXc
      THp=OTc               HTp=TOc
      IRp=UZc               RIp=ZUc
      DBp=FAc               BDp=AFc
      ECp=TEc               CEp=ETc
      TEp=PTc               ETp=TPc
      EIp=TCc               IEp=CTc
      RSp=YAc               SRp=AYc
      -Sp=SMc               S-p=MSc

(8) By Rule V, the equation THp=OTc means that H, T, and O are all in the same row or column
and in the relative order 2-4-3; similarly, C, E, and T are in the same row or column and in the
relative order 2-4-3. Further, E, P, and T are in the same row or column, and their relative
order is also 2-4-3. That is, these sequences must occur in the square:

         (1)                   (2)                   (3)
    H T O . . , or        C E T . . , or        E T P . . , or
    T O . . H , or        E T . . C , or        T P . . E , or
    O . . H T , or        T . . C E , or        P . . E T , or
    . . H T O , or        . . C E T , or        . . E T P , or
    . H T O .             . C E T .             . E T P .


(9) Noting the common letters E and T in the second and third sets of relative orders, these
may be combined into one sequence of four letters. Only one position remains to be filled and
noting, in the list of equivalents, that EIp=TCc, it is obvious that the letter I belongs to the CET
sequence. The complete sequence is therefore as follows:

    C E T P I , or
    E T P I C , or
    T P I C E , or
    P I C E T , or
    I C E T P

(10) Taking up the HTO sequence, it is noted, in the list of equivalents, that ONp=XHc, an
equation containing two of the three letters of the HTO sequence. From this it follows that
N and X must belong to the same row or column as HTO. The arrangement must be one of the
following:

    H T O X N
    T O X N H
    O X N H T
    X N H T O
    N H T O X
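
The chaining in steps (8) to (10) lends itself to a small sketch. It covers only equations in which the first plain letter reappears as the second cipher letter, as in THp=OTc, and it assumes the convention of this example that a letter is replaced by its right-hand (or lower) neighbor. The helper names are illustrative and are not part of the stated rules.

```python
def chain_from_equation(plain, cipher):
    """For an equation such as THp=OTc return the three letters in the
    order in which they stand in their row or column (here H T O)."""
    if plain[0] != cipher[1]:
        raise ValueError("equation is not of the form handled here")
    return plain[1] + plain[0] + cipher[0]

def merge(left, right):
    """Join two chains where the tail of `left` overlaps the head of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None

def rotations(seq):
    """All cyclic permutations of a complete five-letter row or column."""
    return [seq[i:] + seq[:i] for i in range(len(seq))]

hto = chain_from_equation("TH", "OT")
cet = chain_from_equation("EC", "TE")
etp = chain_from_equation("TE", "PT")
cetp = merge(cet, etp)
print(hto, cet, etp, cetp)
print(rotations(cetp + "I"))   # I fills the one remaining place (EIp=TCc)
print(rotations(hto + "XN"))   # X and N follow from ONp=XHc
```

The two printed lists correspond to the five possible placements of each sequence shown above.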


(11) Since the sequence containing HTOXN has a common letter (T) with the sequence
CETPI, it follows that if the HTOXN sequence occupies a row, then the CETPI sequence must
occupy a column; or, if the HTOXN sequence occupies a column, then the CETPI sequence must
occupy a row; and they may be combined by means of their common letter, T. According to
subpar. d (4), the two sequences may be inserted within a Playfair square in 25 different ways
by cyclically permuting and shifting the letters of one of these two sequences; and the same
two sequences may be again inserted in another set of 25 ways by cyclically permuting and
shifting the letters of the other of these two sequences. In Fig. 27 the diagrams labeled (1)
to (10), inclusive, show 10 of the possible 25 obtainable by making the HTOXN sequence one
of the rows of the square; diagrams (11) and (12) show 2 of the possible 25 obtainable by making
the HTOXN sequence one of the columns of the square. The entire complement of 25 arrangements
for each set may easily be drawn up by the student; space forbids their being completely
set forth and it is really unnecessary to do so.


[FIGURE 27. Twelve skeleton squares, numbered (1) to (12). Diagrams (1) to (10) show the
HTOXN sequence, in various cyclic permutations, as a row crossed at T by the CETPI sequence
as a column; diagrams (11) and (12) show the HTOXN sequence as a column crossed at T by
the CETPI sequence as a row.]

(12) Before trying to discover means whereby the actual or absolute arrangement may
be detected from among the full set of 50 possible arrangements, the question may be raised: is
it necessary? So far as concerns Case 2 encipherments, since any one of the 50 arrangements
will yield the same equivalents as any of the remaining 49, perhaps a relative arrangement
will do.
(13) Let arrangement 8 be arbitrarily selected for trial. 


    . . P . .
    . . I . .
    . . C . .
    . . E . .
    N H T O X

FIGURE 28a.


(14) What additional letters can be inserted, using as a guide the list of equivalents in
subparagraph (7)? There is ATp=CXc, for example. It contains only one letter, A, not in the
arrangement selected for trial, and this letter may immediately be placed, as shown:10

    . . P . .
    . . I . .
    . . C . A
    . . E . .
    N H T O X

FIGURE 28b.


Scanning the list for additional cases of this type, none are found. But seeing that several
high-frequency letters have already been inserted in the square, perhaps reference to the cryptogram
itself in connection with values derived from these inserted letters may yield further clues. For
example, the vowels A, E, I, and O are all in position, as are the very frequent consonants N and T.
The following combinations may be studied:

    ANp=θXc      ENp=θTc      INp=θTc      ONp=XHc
    ATp=CXc      ETp=TPc      ITp=CPc      OTp=XOc
    NAp=Xθc      NEp=Tθc      NIp=Tθc      NOp=HXc
    TAp=XCc      TEp=PTc      TIp=PCc      TOp=OXc

ATp(=CXc), TAp(=XCc), ONp(=XHc), TEp(=PTc) and ETp(=TPc) have already been inserted in the
text. Of the others, only OXc(=TOp) occurs two times, and this value can be at once inserted in
the text. But can the equivalents of AN, EN, or IN be found from frequency considerations?

10 The fact that the placement of A yields ATp=CXc means that the outline selected for experiment really
belongs to the correct set of 25 possible cyclic permutations, and that the letters of the NHTOX sequence belong
in a row, the letters of the PICET sequence belong in a column of the original Playfair square. If the reverse
were the case, one could not obtain ATp=CXc but would obtain ATp=XCc.

' 
Take ENp, for example; it is represented by θTc. What combination of θT is most likely to
represent ENp among the following candidates:

    KTc (4 times); by Rule I, NEp would=TKc (no occurrences)
    VTc (5 times); by Rule I, NEp would=TVc (2 times)
    ZTc (3 times); by Rule I, NEp would=TZc (1 time)

VTc certainly looks good: it begins the message, suggesting the word ENEMY; in line H, the
sequence PZTV would become LINE. Let this be assumed to be correct, and let the word ENEMY
also be assumed to be correct. Then EMp=QEc and the square then becomes as shown herewith:


    . . P . .
    . . I . .
    . . C . A
    V M E Q .
    N H T O X

FIGURE 28c.


(15) In line E is seen the following sequence:

    Line E:   VT RK MW CF ZU BH TV YA BG IP RZ KP CQ FN LV
              EN .. .. .. RI .. NE RS .. PT .. .. .E .. ..

The sequence ...RI..NERS..PT... suggests PRISONERS CAPTURED, as follows:

    MW CF ZU BH TV YA BG IP RZ KP
       θP RI SO NE RS CA PT UR ED

This gives the following new values: θPp=CFc; SOp=BHc; CAp=BGc; URp=RZc; EDp=KPc.
The letters B and G can be placed in position at once, since the positions of C and A are already
known. The insertion of the letter B immediately permits the placement of the letter S, from the
equation SOp=BHc. Of the remaining equations only EDp=KPc can be used. Since E and P are
fixed and are in the same column, D and K must be in the same column, and moreover the K must
be in the same row as E. There is only one possible position for K, viz, immediately after Q. This
automatically fixes the position of D. The square is now as shown herewith:


    . . P . D
    . . I . .
    G S C B A
    V M E Q K
    N H T O X

FIGURE 28d.


(16) A review of all equations, including the very first ones established, gives the following
which may now be used: DBp=FAc; RSp=YAc. The first permits the immediate placement of F;
the second, by elimination of possible positions, permits the placement of both R and Y. The
square is now as shown herewith:


    . . P F D
    . Y I . R
    G S C B A
    V M E Q K
    N H T O X

FIGURE 28e.


Once more a review is made of all remaining thus far unused equations. LIp=PZc now permits
the placement of L and Z. IRp=UZc now permits the placement of U, which is confirmed by the
equation URp=RZc from the word CAPTURED.

    L . P F D
    Z Y I U R
    G S C B A
    V M E Q K
    N H T O X

FIGURE 28f.


There is then only one cell vacant, and it must be occupied by the only letter left unplaced,
viz, W. Thus the whole square has been reconstructed, and the message can now be
decryptographed.
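
With the square complete, decryptographing is purely mechanical. The sketch below applies the reconstructed square (with W in the vacant cell) to lines A and E of the cryptogram as read here from the printed message. It assumes the convention of this example, in which encipherment moves one cell right in a row and one cell down in a column, so decipherment moves left and up.

```python
SQUARE = ["LWPFD", "ZYIUR", "GSCBA", "VMEQK", "NHTOX"]  # reconstructed square

def decipher(square, text):
    pos = {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}
    letters = [ch for ch in text.upper() if ch.isalpha()]
    plain = []
    for a, b in zip(letters[0::2], letters[1::2]):
        (r1, c1), (r2, c2) = pos[a], pos[b]
        if r1 == r2:      # same row: one cell to the left
            plain.append(square[r1][(c1 - 1) % 5] + square[r2][(c2 - 1) % 5])
        elif c1 == c2:    # same column: one cell up
            plain.append(square[(r1 - 1) % 5][c1] + square[(r2 - 1) % 5][c2])
        else:             # rectangle: the other two corners
            plain.append(square[r1][c2] + square[r2][c1])
    return "".join(plain)

LINE_A = "VTQEU HIOFT CHXSC AKTVT RAZEV TAGAE"
LINE_E = "VTRKM WCFZU BHTVY ABGIP RZKPC QFNLV"
print(decipher(SQUARE, LINE_A))
print(decipher(SQUARE, LINE_E))
```

The X in BEXEN is the separator inserted between the doubled E of BEEN, which would otherwise have fallen within a single digraph.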
(17) Is the square just reconstructed identical with the original, or is it a cyclic permutation
of a keyword-mixed Playfair square of the type illustrated in Fig. 25b? Even though the
message can be read with ease, this point is still of interest. Let the sequence be written in five
ways, each composed of five partial sequences made by cyclically permuting each of the horizontal
rows of the reconstructed square. Thus:

            Row 1        Row 2        Row 3        Row 4        Row 5
    (a)   L W P F D    Z Y I U R    G S C B A    V M E Q K    N H T O X
    (b)   W P F D L    Y I U R Z    S C B A G    M E Q K V    H T O X N
    (c)   P F D L W    I U R Z Y    C B A G S    E Q K V M    T O X N H
    (d)   F D L W P    U R Z Y I    B A G S C    Q K V M E    O X N H T
    (e)   D L W P F    R Z Y I U    A G S C B    K V M E Q    X N H T O
By experimenting with these five sequences, in an endeavor to reconstruct a transposition
rectangle conformable to a keyword sequence, the last sequence yields the following:

      P Y A C M N
    D F I G B E H
    L R U S K Q T
    W Z     V X O

By shifting the O from the last position to the first, and rearranging the columns, the following
is obtained:

    2 5 3 6 1 4 7
    C O M P A N Y
    B D E F G H I
    K L Q R S T U
    V W X Z


The original square must have been this:

    A G S C B
    K V M E Q
    X N H T O
    D L W P F
    R Z Y I U

FIGURE 28g.
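
The relation between the reconstructed square and the original can be confirmed by trying the 25 cyclic shifts. The sketch below is illustrative; the function names are not taken from the text.

```python
RECONSTRUCTED = ["LWPFD", "ZYIUR", "GSCBA", "VMEQK", "NHTOX"]
ORIGINAL = ["AGSCB", "KVMEQ", "XNHTO", "DLWPF", "RZYIU"]

def shifted(square, down, across):
    """Cyclically permute the columns by `across` and the rows by `down`."""
    rows = [row[across:] + row[:across] for row in square]
    return rows[down:] + rows[:down]

def find_shift(source, target):
    """Return (down, across) if `target` is a cyclic permutation of `source`."""
    for down in range(5):
        for across in range(5):
            if shifted(source, down, across) == target:
                return down, across
    return None

print(find_shift(ORIGINAL, RECONSTRUCTED))
```

The square built up during solution is the original with its columns moved one place and its rows moved three places, which is why it yields the same equivalents.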


f. Continued practice in the solution of Playfair ciphers will make the student quite expert
in the matter and will enable him to solve shorter and shorter messages.11 Also, with practice
it will become a matter of indifference to him as to whether the letters are inserted in the square
with any sort of regularity, such as simple keyword-mixed order, columnar transposed
keyword-mixed order, or in a purely random order.
g. It may perhaps seem to the student that the foregoing steps are somewhat too artificial,
a bit too "cut and dried" in their accuracy to portray the process of analysis, as it is applied in
practice. For example, the critical student may well object to some of the assumptions and the
reasoning in step (5) above, in which the words THREE and ONE (1st hypothesis) were rejected
in favor of the words THIRD and SECOND (2nd hypothesis). This rested largely upon the
rejection of REp and ERp as the equivalents of UZc and ZUc, and the adoption of IRp and RIp as
their equivalents. Indeed, if the student will examine the final message with a critical eye he
will find that while the bit of reasoning in step (5) is perfectly logical, the assumption upon which
it is based is in fact wrong, for it happens that in this case ERp occurs only once and REp does not
occur at all. Consequently, although most of the reasoning which led to the rejection of the 1st
hypothesis and the adoption of the 2nd was logical, it was in fact based upon erroneous
assumption. In other words, despite the fact that the assumption was incorrect, a correct deduction
was made.
The student should take note that in cryptanalysis situations of this sort are not at all
unusual. Indeed they are to be expected and a few words of explanation at this point may be useful.

11 The author once had a student who "specialized" in Playfair ciphers and became so adept that he could
solve messages containing as few as 50-60 letters within 30 minutes.

h. Cryptanalysis is a science in which deduction, based upon observational data, plays a
very large role. But it is also true that in this science most of the deductions usually rest upon
assumptions. It is most often the case that the cryptanalyst is forced to make his assumptions
upon a quite limited amount of text. It cannot be expected that assumptions based upon
statistical generalizations will always hold true when applied to data comparatively very much
smaller in quantity than the total data used to derive the generalized rules. Consequently, as
regards assumptions made in specific messages, most of the time they will be correct, but
occasionally they will be incorrect. In cryptanalysis it is often found that among the correct
deductions there will be cases in which subsequently discovered facts do not bear out the assumptions
on which the deduction was based. Indeed, it is sometimes true that if the facts had been known
before the deduction was made, this knowledge would have prevented making the correct
deduction. For example, suppose the cryptanalyst had somehow or other divined that the message
under consideration contained no RE, only one ER, one IR, and two RI's (as is actually the case).
He would certainly not have been able to choose between the words THREE and ONE (1st
hypothesis) as against THIRD and SECOND (2d hypothesis). But because he assumes that there
should be more ERp's and REp's than IR's and RI's in the message, he deduces that UZc cannot
be REp, rejects the 1st hypothesis and takes the 2d. It later turns out, after the problem has been
solved, that the deduction was correct, although the assumption on which it was based (expectation
of more frequent appearance of REp and ERp) was, in fact, not true in this particular case. The
cryptanalyst can only hope that the number of times when his deductions are correct, even though
based upon assumptions which later turn out to be erroneous, will abundantly exceed the
number of times when his deductions are wrong, even though based upon assumptions which later
prove to be correct. If he is lucky, the making of an assumption which is really not true will
make no difference in the end and will not delay solution; but if he is specially favored with
luck, it may actually help him solve the message, as was the case in this particular example.
i. Another comment of a general nature may be made in connection with this specific
example. The student may ask what would have been the procedure in this case if the message
had not contained such a tell-tale repetition as the word BATTALION, which formed the point
of departure for the solution, or, as it is often said, permitted an "entering wedge" to be driven
into the message. The answer to his query is that if the word BATTALION had not been repeated,
there would probably have been some other repetition which would have permitted the same
sort of attack. If the student is looking for cut and dried, straightforward, unvarying methods
of attack, he should remember that cryptanalysis, while it may be considered a branch of
mathematics, is not a science which has many "general solutions" such as are found and expected in
mathematics proper. It is inherent in the very nature of cryptanalytics that, as a rule, only
general principles can be established; their practical application must take advantage of
peculiarities and particular situations which are noted in specific messages. This is especially true in
a text on the subject. The illustration of a general principle requires a specific example, and the
latter must of necessity manifest characteristics which make it different from any other example.
The word BATTALION was not purposely repeated in this example in order to make the
demonstration of solution easy; "it just happened that way." In another example, some other entering
wedge would have been found. The student can be expected to learn only the general principles
which will enable him to take advantage of the specific characteristics manifested in specific cases.
Here it is desired to illustrate the general principles of solving Playfair ciphers and to point out
the fact that entering wedges must and can be found. The specific nature of the entering wedge
varies with specific examples.
47. Special remarks concerning the initial classification of cryptograms.-a. The student
should by this time have a good conception of the basic nature of monoalphabetic substitution and
of the many "changes" which may be rung upon this simple tune. The first step of all, naturally,
is to be able to classify a cryptogram properly and place it in either the transposition or the
substitution class. The tests for this classification have been given and as a rule the student
will encounter no difficulty in this respect.
b. There are, however, certain kinds of cryptograms whose class cannot be determined in the
usual manner, as outlined in Par. 13 of this text. First of all there is the type of code message
which employs bona-fide dictionary words as code groups.1 Naturally, a frequency distribution
of such a message will approximate that for normal plain text. The appearance of the message,
however, gives clear indications of what is involved. The study of such cases will be taken up in
its proper place. At the moment it is only necessary to point out that these are code messages and
not cipher, and it is for this reason that in Pars. 12 and 13 the words "cipher" and "cipher
messages" are used, the word "cryptogram" being used only where technically correct.
c. Secondly, there come the unusual and borderline cases, including cryptograms whose
nature and type can not be ascertained from frequency distributions. Here, the cryptograms are
technically not ciphers but special forms of disguised secret writings which are rarely susceptible
of being classed as transposition or substitution. These include a large share of the cases wherein
the cryptographic messages are disguised and carried under an external, innocuous text which is
innocent and seemingly without cryptographic content-for instance, in a message where
specific letters are indicated in a way not open to suspicion under censorship, these letters being
intended to constitute the letters of the cryptographic message and the other letters constituting
"dummies." Obviously, no amount of frequency tabulations will avail a competent, expert
cryptanalyst in demonstrating or disclosing the presence of a cryptographic message, written and
secreted within the "open" message, which serves but as an envelop and disguise for its authentic
or real import. Certainly, such frequency tabulations can disclose the existence neither of sub-
stitution nor transposition in these cases, since both forms are absent. Another very popular
method that resembles the method mentioned above has for its basis a simple grille. The whole
words forming the secret text are inserted within perforations cut in the paper and the remaining
space filled carefully, using "nulls" and "dummies", making a seemingly innocuous, ordinary
message. There are other methods of this general type which can obviously neither be detected
nor cryptanalyzed, using the principles of frequency of recurrences and repetition. These can
not be further discussed herein, but at a subsequent date a special text may be written for their
handling.2


1 See Sec. XV, Elementary Military Cryptography.
2 The subparagraph which the student has just read (47c) contains a hidden cryptographic message. With
the hints given in Par. 35e let the student see if he can find it.
48. Ciphers employing characters other than letters or figures.-a. In view of the foregoing
remarks, when so-called symbol ciphers, that is, ciphers employing peculiar symbols,
signs of punctuation, diacritical marks, figures of "dancing men", and so on are encountered
in practical work nowadays, they are almost certain to be simple, monoalphabetic ciphers.
They are adequately described in romantic tales,3 in popular books on cryptography, and in
the more common types of magazine articles. No further space need be given ciphers of this
type in this text, not only because of their simplicity but also because they are encountered
in military cryptography only in sporadic instances, principally in censorship activities. Even
in the latter cases, it is usually found that such ciphers are employed in "intimate" correspondence
for the exchange of sentiments that appear less decorous when set forth in plain language. They
are very seldom used by authentic enemy agents. When such a cipher is encountered nowadays
it may practically always be regarded as the work of the veriest tyro, when it is not that of a
"crank" or a mentally-deranged person.
b. The usual preliminary procedure in handling such cases, where the symbols may be
somewhat confusing to the mind because of their unfamiliar appearance to the eye, is to substitute
letters for them consistently throughout the message and then treat the resulting text as an
ordinary cryptogram composed of letters is treated. This procedure also facilitates the construction
of the necessary frequency distributions, which would be tedious to construct by using symbols.
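
A minimal sketch of this preliminary substitution follows. It assigns letters in the order in which the symbols are first met, which is merely one convenient convention and not one prescribed by the text; the sample symbols are arbitrary stand-ins.

```python
import string

def symbols_to_letters(symbols):
    """Replace each distinct symbol by a letter, consistently throughout,
    assigning A, B, C, ... in order of first appearance."""
    mapping = {}
    out = []
    for sym in symbols:
        if sym not in mapping:
            if len(mapping) == 26:
                raise ValueError("more than 26 distinct symbols")
            mapping[sym] = string.ascii_uppercase[len(mapping)]
        out.append(mapping[sym])
    return "".join(out), mapping

text, key = symbols_to_letters(["#", "@", "#", "*", "@"])
print(text, key)
```

The converted text can then be handed to any routine that tabulates letter frequencies, and the mapping is kept so that the solution can be related back to the original symbols.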
c. A final word must be said on the subject of symbol ciphers by way of caution. When
symbols are used to replace letters, syllables, and entire words, then the systems approach code
methods in principle, and can become difficult of solution.4 The logical extension of the use of
symbols in such a form of writing is the employment of arbitrary characters for a specially
developed "shorthand" system bearing little or no resemblance to well-known, and therefore
nonsecret, systems of shorthand, such as Gregg, Pitman, etc. Unless a considerable amount
of text is available for analysis, a privately-devised shorthand may be very difficult to solve.
Fortunately, such systems are rarely encountered in military cryptography. They fall under the
heading of cryptographic curiosities, of interest to the cryptanalyst in his leisure moments.5
d. In practical cryptography today, as has been stated above, the use of characters other
than the 26 letters of the English alphabet is comparatively rare. It is true that there are a
few governments which still adhere to systems yielding cryptograms in groups of figures. These
are almost in every case code systems and will be treated in their proper place. In some cases
cipher systems, or systems of enciphering code, are used which are basically mathematical in
character and operation, and therefore use numbers instead of letters. Some persons are
inclined toward the use of numbers rather than letters because numbers lend themselves much
more readily to certain arithmetical operations such as addition, subtraction, and so on, than
do letters.6 But there is usually added some final process whereby the figure groups are
converted into letter groups, for the sake of economy in transmission.


3 The most famous: Poe's The Gold Bug; Arthur Conan Doyle's The Sign of Four.
4 The use of symbols for abbreviation and speed in writing goes back to the days of antiquity. Cicero is
reported to have drawn up "a book like a dictionary, in which he placed before each word the notation (symbol)
which should represent it, and so great was the number of notations and words that whatever could be written in
Latin could be expressed in his notations."
5 An example is found in the famous Pepys Diary, which was written in shorthand, purely for his own eyes,
by Samuel Pepys (1633-1703). "He wrote it in Shelton's system of tachygraphy (1641), which he complicated
by using foreign languages or by varieties of his own invention whenever he had to record passages least fit to be
seen by his servants, or by 'all the world.' "
6 But this, of course, is because we are taught arithmetic by using numbers, based upon the decimal system
as a rule. By special training one could learn to perform the usual "arithmetical" operations using letters.
For example, using our English alphabet of 26 letters, where A=1, B=2, C=3, etc., it is obvious that A+B=C,
just as 1+2=3; (A+B)²=I, etc. This sort of cryptographic arithmetic could be learned by rote, just as
multiplication tables are learned.
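
The letter arithmetic of the last note can be tried directly. This is a small sketch with illustrative function names, using the note's own values A=1, B=2, ... Z=26 and making no provision for results beyond 26.

```python
def value(letter):
    """Numerical value of a letter, with A=1, B=2, ... Z=26."""
    return ord(letter.upper()) - ord("A") + 1

def letter(number):
    """Letter standing for a number from 1 to 26."""
    if number not in range(1, 27):
        raise ValueError("result lies outside the 26-letter alphabet")
    return chr(ord("A") + number - 1)

print(letter(value("A") + value("B")))         # A+B
print(letter((value("A") + value("B")) ** 2))  # (A+B) squared
```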
e. The only notable exceptions to the statement contained in the first sentence of the
preceding subparagraph are those of Russian messages transmitted in the Russian Morse alphabet
and Japanese messages transmitted in the Kata Kana Morse alphabet. As regards Chinese,
which is not an alphabetical language and comprises some 40,000 ideographs, since the Morse
telegraph code comprises only some 40 combinations, telegrams in Chinese are usually prepared
by means of codes which permit of substituting arbitrarily-assigned code groups for the
characters. Usually the code groups consist of figures. One such code, known as the Official Chinese
Telegraph Code, has about 10,000 4-figure groups, beginning with 0001, and these are arranged
so that there are 100 characters on each page. Sometimes, for purposes of secrecy or economy,
these figure groups are enciphered and converted into letter groups.
49. Concluding remarks concerning monoalphabetic substitution.-a. The alert student will
have by this time gathered that monoalphabetic substitution ciphers of the simple
or fixed type are particularly easy to solve, once the underlying principles are thoroughly
understood. As in other arts, continued practice with examples leads to facility and skill in solution,
especially where the student concentrates his attention upon traffic all of the same general nature,
so that the type of text which he is continually encountering becomes familiar to him and its
peculiarities or characteristics of construction give clues for short cuts to solution. It is true
that a knowledge of the general phraseology of messages, the kind of words used, their sequences,
and so on, is of very great assistance in practical work in all fields of cryptanalysis. The student
is urged to note particularly these finer details in the course of his study.
b. Another thing which the student should be on the lookout for in simple monoalphabetic 
substitution is the consecutive use of several different mixed cipher alphabets in a single long 
message. Obviously, a single, composite frequency distribution for the whole message will not 
show the characteristic crest and trough appearance of a simple monoalphabetic cipher, since a 
given cipher letter will represent different plain-text letters in different parts of the message. 
But if the cryptanalyst will carefully observe the distribution as it is being compiled, he will 
note that at first it presents the characteristic crest and trough appearance of monoalphabeticity, 
and that after a time it begins to lose this appearance. If possible he should be on the lookout 
for some peculiarity of grouping of letters which serves as an indicator for the shift from one
cipher alphabet to the next. If he finds such an indicator he should begin a second distribution 
from that point on, and proceed until another shift or indicator is encountered. By thus isolating 
the different portions of the text, and restricting the frequency distributions to the separate 
monoalphabets, the problem may be treated then as an ordinary simple monoalphabetic sub- 
stitution. Consideration of these remarks in connection with instances of this kind leads to the 
comment that it is often more advisable for the cryptanalyst to compile his own data, than to 
have the latter prepared by clerks, especially when studying a system de novo. For observations 
which will certainly escape an untrained clerk can be most useful and may indeed facilitate 
solution. For example, in the case under consideration, if a clerk should merely hand the uni- 
literal distribution to the cryptanalyst, the latter might be led astray; the appearance of the 
composite distribution might convince him that the cryptogram is a good deal more complicated 
than it really is. 
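The procedure of subparagraph b can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the numerical "roughness" measure used here (the index of coincidence) is a stand-in, not taken from this text, for the visual judgment of crest and trough appearance. It shows that each portion enciphered in one alphabet keeps its monoalphabetic roughness, while the composite distribution of the whole message is flatter.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def encipher(plain, cipher_alphabet):
    """Simple monoalphabetic substitution with one 26-letter cipher alphabet."""
    return plain.upper().translate(str.maketrans(ALPHABET, cipher_alphabet))


def roughness(text):
    """Index of coincidence: a numerical stand-in (an assumption of this
    sketch) for the crest-and-trough appearance of a uniliteral distribution."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def distributions_by_portion(cipher, shift_points):
    """Split the text at the suspected shift points and compile a separate
    frequency distribution (with its roughness) for each portion."""
    bounds = [0, *shift_points, len(cipher)]
    portions = [cipher[a:b] for a, b in zip(bounds, bounds[1:])]
    return [(Counter(p), roughness(p)) for p in portions]
```

In use, the cryptanalyst would call `distributions_by_portion` with the positions at which he suspects an indicator, and compare the roughness of each portion with that of the whole text; a marked rise for the separate portions supports the hypothesis of consecutive alphabets.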
c. Monoalphabetic substitution with variants represents an extension of the basic principle, 
with the intention of masking the characteristic frequencies resulting from a strict monoalpha- 
beticity, by means of which solutions are rather readily obtained. Some of the subterfuges 
applied on the establishment of variant or multiple values are simple and more or less fail to 
serve the purpose for which they are intended; others, on the contrary, may interpose serious 
difficulties to a straightforward solution. But in no case may the problem be considered of more 
than ordinary difficulty. Furthermore, it should be recognized that where these subterfuges 
are really adequate to the purpose, the complications introduced are such that the practical 
manipulation of the system becomes as difficult for the cryptographer as for the cryptanalyst. 
d. As already mentioned, in monoalphabetic substitution with variants it is most common 
to employ figures or groups of figures. The reason for this is that the use of numerical groups 
seems more natural or easier to the uninitiated than does the use of varying combinations of 
letters. Moreover, it is easy to draw up cipher alphabets in which some of the letters are 
represented by single digits, others by pairs of digits. Thus, the decomposition of the cipher 
text, which is an irregular intermixture of uniliteral and multiliteral equivalents, is made more 
complicated and correspondingly difficult for the cryptanalyst, who does not know which 
digits are to be used separately, which in pairs. 
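The decomposition problem of subparagraph d may be illustrated by a short Python sketch. It assumes, purely for illustration, a cipher alphabet in which certain digits (here called `pair_leaders`, a hypothetical name) always begin a two-digit equivalent and all other digits stand alone; the correspondent knows this set, while the cryptanalyst must test hypotheses about it.

```python
def decompose(digits, pair_leaders):
    """Divide a run of cipher digits into single-digit and two-digit
    equivalents, assuming every digit in pair_leaders begins a pair."""
    groups = []
    i = 0
    while i < len(digits):
        width = 2 if digits[i] in pair_leaders else 1
        if i + width > len(digits):
            raise ValueError("text ends in the middle of a two-digit equivalent")
        groups.append(digits[i:i + width])
        i += width
    return groups
```

The same cipher text divides quite differently under different assumptions: with 8 and 9 taken as pair leaders, `3819427` divides into 3, 81, 94, 2, 7; with 3 and 1 taken as pair leaders, it divides into 38, 19, 4, 2, 7. Only the correct division yields groups whose frequencies behave like those of a monoalphabet.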
e. A few words may be added here in regard to a method which often suggests itself to lay- 
men. This consists in using a book possessed by all the correspondents and indicating the letters 
of the message by means of numbers referring to specific letters in the book. One way consists 
in selecting a certain page and then giving the line number and position of the letter in the line, 
the page number being shown by a single initial indicator. Another way is to use the entire 
book, giving the cipher equivalents in groups of three numbers representing page, line, and 
number of letter. 
(Ex.: 75-8-10 means page 75, 8th line, 10th letter in the line.) Such systems 
are, however, extremely cumbersome to use and, when the cryptographing is done carelessly, 
can be solved. The basis for solution in such cases rests upon the use of adjacent letters on the 
same line, the accidental repetitions of certain letters, and the occurrence of unenciphered words 
in the messages, when laziness or fatigue intervenes in the cryptographing.7 
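The second method described in subparagraph e (page, line, and number of letter) can be sketched as follows. The representation of the book as a Python dictionary, and all function names, are assumptions made for this illustration only; positions count letters, not spaces or punctuation, in accordance with the example 75-8-10.

```python
from collections import defaultdict
from itertools import cycle


def letters_of(line):
    """The letters of a printed line, ignoring spaces and punctuation."""
    return [ch.upper() for ch in line if ch.isalpha()]


def build_index(book):
    """book maps page number -> list of lines (hypothetical representation).
    Returns every (page, line, letter-number) reference for each letter."""
    index = defaultdict(list)
    for page, lines in sorted(book.items()):
        for line_no, line in enumerate(lines, start=1):
            for pos, ch in enumerate(letters_of(line), start=1):
                index[ch].append((page, line_no, pos))
    return index


def encipher(message, book):
    """Replace each letter by a page-line-letter group, taking the
    available references for that letter in rotation."""
    pickers = {ch: cycle(refs) for ch, refs in build_index(book).items()}
    groups = []
    for ch in letters_of(message):
        if ch not in pickers:
            raise ValueError(f"letter {ch} does not occur in the book")
        groups.append("-".join(str(n) for n in next(pickers[ch])))
    return " ".join(groups)


def decipher(cipher, book):
    """Look up each page-line-letter group in the book."""
    plain = []
    for group in cipher.split():
        page, line_no, pos = (int(n) for n in group.split("-"))
        plain.append(letters_of(book[page][line_no - 1])[pos - 1])
    return "".join(plain)
```

An encipherer who rotates through all the references, as this sketch does, avoids some of the weaknesses named above; a careless one who repeatedly takes adjacent letters on the same line, or reuses the same reference, furnishes exactly the clues on which solution rests.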
f. It may also be indicated that human nature and the fallibility of cipher clerks are such 
that it is rather rare for an encipherer to make full use of the complement of variants placed 
at his disposal. The result is that in most cases certain of the equivalents will be used so much 
more often than others that diversities in frequencies will soon manifest themselves, affording 
important data for attack by the cryptanalyst. 
g. In the World War the cases where monoalphabetic substitution ciphers were employed 
in actual operations on the Western Front were exceedingly rare because the majority of the 
belligerents had a fair knowledge of cryptography. On the Eastern Front, however, the exten- 
sive use, by the poorly prepared Russian Army, of monoalphabetic ciphers in the fall of 1914 
was an important, if not the most important, factor in the success of the German operations 
during the Battle of Tannenberg.8 It seems that a somewhat more secure cipher system was 
authorized, but proved too difficult for the untrained Russian cryptographic and radio personnel. 
Consequently, recourse was had to simple substitution ciphers, somewhat interspersed with 
plain text, and sometimes to messages completely in plain language. The damage which this 
faulty use of cryptography did to the Russian Army and thus to the Allied cause is incalculable. 


h. Many of the messages found by censors in letters sent by mail during the World War 
were cases of monoalphabetic substitution, disguised in various ways. 


7 In 1915 the German Government conspired with a group of Hindu revolutionaries to stir up a rebellion in 
India, the purpose being to cause the withdrawal of British troops from the Western Front. Hindu conspirators 
in the United States were given money to purchase arms and ammunition and to transport them to India. For 
communication with their superiors in Berlin the conspirators used, among others, the system described in this 
paragraph. A 7-page typewritten letter, built up from page, line, and letter-number references to a book known 
only to the communicants, was intercepted by the British and turned over to the United States Government 
for use in connection with the prosecution of the Hindus for violating our neutrality. The author solved this 
message without the book in question, by taking full advantage of the clues referred to. 
8 Gyldén, Yves. Chifferbyråernas insatser i världskriget till lands, Stockholm, 1931. A translation under 
the title The Contribution of the Cryptographic Bureaus in the World War, appeared in the Signal Corps Bulletin 
in seven successive installments, from November-December 1933 to November-December 1934, inclusive. 
Nikolaieff, A. M. Secret Causes of German success on the Eastern Front. Coast Artillery Journal, September- 
October, 1935. 
50. Analytical key for cryptanalysis.-a. It may be of assistance to indicate, by means of an 
outline, the relationships existing among the various cryptographic systems thus far considered. 
This graphic outline will be augmented from time to time as the different cipher systems are 
examined, and will constitute what has already been alluded to in Par. 6d and there termed an 
analytical key for cryptanalysis.9 Fundamentally its nature is that of a schematic classification 
of the different systems examined. The analytical key forms an insert at the end of the book. 
b. Note, in the analytical key, the rather clear-cut, dichotomous method of treatment; that 
is, classification by subdivision into pairs. For example, in the very first step there are only 
two alternatives: the cryptogram is either (1) cipher, or (2) code. If it is cipher, it is either 
(1) substitution, or (2) transposition. If it is a substitution cipher, it is either (1) monographic, 
or (2) polygraphic; and so on. If the student will study the analytical key attentively, it will 
assist him in fixing in mind the manner in which the various systems covered thus far are related 
to one another, and this will be of benefit in clearing away some of the mental fog or haziness 
from which he is at first apt to suffer. 
c. The numbers in parentheses refer to specific paragraphs in this text, so that the student 
may readily turn to the text for detailed information or for purposes of refreshing his memory 
as to procedure. 
d. In addition to these reference numbers there have been affixed to the successive steps 
in the dichotomy, numbers that mark the "routes" on the cryptanalytic map (the analytical 
key) which the student cryptanalyst should follow if he wishes to facilitate his travels along the 
rather complicated and difficult road to success in cryptanalysis, in somewhat the same way in 
which an intelligent motorist follows the routes indicated on a geographical map if he wishes to 
facilitate his travels along unfamiliar roads. The analogy is only partially valid, however. 
The motorist usually knows in advance the distant point which he desires to reach and he pro- 
ceeds thereto by the best and shortest route, which he finds by observing the route indications 
on a map and following the route markers on the road. Occasionally he encounters a detour 
but these are unexpected difficulties as a rule. Least of all does he anticipate any necessity for 
journeys down what may soon turn out to be blind alleys and "dead-end" streets, forcing him 
to double back on his way. Now the cryptanalyst also has a distant goal in mind-the solution 
of the cryptogram at hand-but he does not know at the outset of his journey the exact spot 
where it is located on the cryptanalytic map. The map contains many routes and he proceeds 
to test them one by one, in a successive chain. He encounters many blind alleys and dead-end 
streets, which force him to retrace his steps; he makes many detours and jumps many hurdles. 
Some of these retracings of steps, doubling back on his tracks, jumping of hurdles, and detours 
are unavoidable, but a few are avoidable. If properly employed, the analytical key will help 
the careful student to avoid those which should and can be avoided; if it does that much it will 
serve the principal purpose for which it is intended. 

9 This analytical key is quite analogous to the analytical keys usually found in the handbooks biologists 
commonly employ in the classification and identification of living organisms. In fact, there are several points 
of resemblance between, for example, that branch of biology called taxonomic botany and cryptanalysis. In 
the former the first steps in the classificatory process are based upon observation of externally quite marked 
differences; as the process continues, the observational details become finer and finer, involving more and more 
difficulties as the work progresses. Towards the end of the work the botanical taxonomist may have to dissect 
the specimen and study internal characteristics. The whole process is largely a matter of painstaking, accurate 
observation of data and drawing proper conclusions therefrom. Except for the fact that the botanical taxonomist 
depends almost entirely upon ocular observation of characteristics while the cryptanalyst in addition to observa- 
tion must use some statistics, the steps taken by the former are quite similar to those taken by the latter. It is 
only at the very end of the work that a significant dissimilarity between the two sciences arises. If the botanist 
makes a mistake in observation or deduction, he merely fails to identify the specimen correctly; he has an 
"answer", but the answer is wrong. He may not be cognizant of the error; however, other more skillful botanists 
will find him out. But if the cryptanalyst makes a mistake in observation or deduction, he fails to get any 
"answer" at all; he needs nobody to tell him he has failed. Further, there is one additional important point of 
difference. The botanist is studying a bit of Nature, and she does not consciously interpose obstacles, pitfalls, 
and dissimulations in the path of those trying to solve her mysteries. The cryptanalyst, on the other hand, is 
studying a piece of writing prepared with the express purpose of preventing its being read by any persons for 
whom it is not intended. The obstacles, pitfalls, and dissimulations are here consciously interposed by the one 
who cryptographed the message. These, of course, are what make cryptanalysis different and difficult. 


e. The analytical key may, however, serve another purpose of a somewhat different nature. 
When a multitude of cryptographic systems of diverse types must be filed in some systematic 
manner apart from the names of the correspondents or other reference data, or if in conducting 
instructional activities classificatory designations are desirable, the reference numbers on the 
analytical key may be made to serve as "type numbers." Thus, instead of stating that a given 
cryptogram is a keyword-systematically-mixed-uniliteral-monoalphabetic-monographic substitu- 
tion cipher one may say that it is a "Type 901 cryptogram." 


f. The method of assigning type numbers is quite simple. If the student will examine the 
numbers he will note that successive levels in the dichotomy are designated by successive hun- 
dreds. Thus, the first level, the classification into cipher and code, is assigned the numbers 101 
and 102. On the second level, under cipher, the classification into monographic and polygraphic 
systems is assigned the numbers 201 and 202, etc. Numbers in the same hundreds apply 
therefore to systems at the same level in the classification. There is no particular virtue in this 
scheme of assigning type numbers except that it provides for a considerable degree of expansion 
in future studies. 
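The arithmetic of the type numbers may be shown in a brief Python sketch. The function names are hypothetical; the only facts taken from the text are that the hundreds digit marks the level in the dichotomy and that numbers in the same hundreds lie on the same level.

```python
def level_and_serial(type_number):
    """Split a type number such as 901 into its level in the dichotomy
    (the hundreds) and its serial number within that level."""
    if type_number < 101:
        raise ValueError("type numbers begin at 101")
    return divmod(type_number, 100)


def same_level(first, second):
    """True when two type numbers fall in the same hundreds."""
    return first // 100 == second // 100
```

Thus `level_and_serial(901)` gives level 9, serial 1, and the numbers 201 and 202 are found to lie on the same level, whereas 102 and 201 do not.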
TABLE 1-A.-Absolute frequencies of letters appearing in five sets of Governmental plain-text telegrams, 
each set containing 10,000 letters, arranged alphabetically 

Letter    Set No. 1    Set No. 2    Set No. 3    Set No. 4    Set No. 5 

A              738          783          681          740          741 
B              104          103           98           83           99 
C              319          300          288          326          301 
D              387          413          423          451          448 
E            1,367        1,294        1,292        1,270        1,275 
F              253          287          308          287          281 
G              166          175          161          167          150 
H              310          351          335          349          349 
I              742          750          787          700          697 
J               18           17           10           21           16 
K               36           38           22           21           31 
L              365          393          333          386          344 
M              242          240          238          249          268 
N              786          794          815          800          780 
O              685          770          791          756          762 
P              241          272          317          245          260 
Q               40           22           45           38           30 
R              760          745          762          735          786 
S              658          583          585          628          604 
T              936          879          894          958          928 
U              270          233          312          247          238 
V              163          173          142          133          155 
W              166          163          136          133          182 
X               43           50           44           53           41 
Y              191          155          179          213          229 
Z               14           17            2           11            5 

Total       10,000       10,000       10,000       10,000       10,000 
TABLE 2-A.-Absolute frequencies of letters appearing in the combined five sets of messages totalling 
50,000 letters, arranged alphabetically 

A   3,683      G     819      L   1,821      Q     175      V     766 
B     487      H   1,694      M   1,237      R   3,788      W     780 
C   1,534      I   3,676      N   3,975      S   3,058      X     231 
D   2,122      J      82      O   3,764      T   4,595      Y     967 
E   6,498      K     148      P   1,335      U   1,300      Z      49 
F   1,416 


TABLE 1-B.-Absolute frequencies of letters appearing in five sets of Government plain-text telegrams, 
each set containing 10,000 letters, arranged according to frequency 

   Set No. 1        Set No. 2        Set No. 3        Set No. 4        Set No. 5 
Letter  Freq.    Letter  Freq.    Letter  Freq.    Letter  Freq.    Letter  Freq. 

  E     1,367      E     1,294      E     1,292      E     1,270      E     1,275 
  T       936      T       879      T       894      T       958      T       928 
  N       786      N       794      N       815      N       800      R       786 
  R       760      A       783      O       791      O       756      N       780 
  I       742      O       770      I       787      A       740      O       762 
  A       738      I       750      R       762      R       735      A       741 
  O       685      R       745      A       681      I       700      I       697 
  S       658      S       583      S       585      S       628      S       604 
  D       387      D       413      D       423      D       451      D       448 
  L       365      L       393      H       335      L       386      H       349 
  C       319      H       351      L       333      H       349      L       344 
  H       310      C       300      P       317      C       326      C       301 
  U       270      F       287      U       312      F       287      F       281 
  F       253      P       272      F       308      M       249      M       268 
  M       242      M       240      C       288      U       247      P       260 
  P       241      U       233      M       238      P       245      U       238 
  Y       191      G       175      Y       179      Y       213      Y       229 
  G       166      V       173      G       161      G       167      W       182 
  W       166      W       163      V       142      V       133      V       155 
  V       163      Y       155      W       136      W       133      G       150 
  B       104      B       103      B        98      B        83      B        99 
  X        43      X        50      Q        45      X        53      X        41 
  Q        40      K        38      X        44      Q        38      K        31 
  K        36      Q        22      K        22      K        21      Q        30 
  J        18      J        17      J        10      J        21      J        16 
  Z        14      Z        17      Z         2      Z        11      Z         5 

Total  10,000           10,000           10,000           10,000           10,000 


TABLE 1-C.-Absolute frequencies of vowels, high frequency consonants, medium frequency con- 
sonants, and low frequency consonants appearing in five sets of Government plain-text tele- 
grams, each set containing 10,000 letters 

Set No.      Vowels    High Frequency    Medium Frequency    Low Frequency 
                         Consonants         Consonants         Consonants 

1             3,993         3,527              2,329               151 
2             3,985         3,414              2,457               144 
3             4,042         3,479              2,356               123 
4             3,926         3,572              2,358               144 
5             3,942         3,546              2,389               123 

Total 1      19,888        17,538             11,889               685 

1 Grand total, 50,000. 


TABLE 2-B.-Absolute frequencies of letters appearing in the combined five sets of messages totalling 
50,000 letters, arranged according to frequencies 

E   6,498      I   3,676      C   1,534      Y     967      X     231 
T   4,595      S   3,058      F   1,416      G     819      Q     175 
N   3,975      D   2,122      P   1,335      W     780      K     148 
R   3,788      L   1,821      U   1,300      V     766      J      82 
O   3,764      H   1,694      M   1,237      B     487      Z      49 
A   3,683 


TABLE 2-C.-Absolute frequencies of vowels, high frequency consonants, medium frequency con- 
sonants, and low frequency consonants appearing in the combined five sets of messages totalling 
50,000 letters 

Vowels                                                            19,888 
High Frequency Consonants (D, N, R, S, and T)                     17,538 
Medium Frequency Consonants (B, C, F, G, H, L, M, P, V, and W)    11,889 
Low Frequency Consonants (J, K, Q, X, and Z)                         685 

Total                                                             50,000 
TABLE 2-D.-Absolute frequencies of letters as initial letters of 10,000 words found in Government 
plain-text telegrams 

(1) ARRANGED ALPHABETICALLY 

A     905      G     109      L     196      Q      30      V      77 
B     287      H     272      M     384      R     611      W     320 
C     664      I     344      N     441      S     965      X       4 
D     525      J      44      O     646      T   1,253      Y      88 
E     390      K      23      P     433      U     122      Z      12 
F     855 
                                                  Total    10,000 

(2) ARRANGED ACCORDING TO ABSOLUTE FREQUENCIES 

T   1,253      R     611      M     384      L     196      J      44 
S     965      D     525      I     344      U     122      Q      30 
A     905      N     441      W     320      G     109      K      23 
F     855      P     433      B     287      Y      88      Z      12 
C     664      E     390      H     272      V      77      X       4 
O     646 
                                                  Total    10,000 


TABLE 2-E.-Absolute frequencies of letters as final letters of 10,000 words found in Government 
plain-text telegrams 

(1) ARRANGED ALPHABETICALLY 

A     269      G     225      L     354      Q       8      V       4 
B      22      H     450      M     154      R     769      W      45 
C      86      I      22      N     872      S     962      X     116 
D   1,002      J       6      O     575      T   1,007      Y     866 
E   1,628      K      53      P     213      U      31      Z       9 
F     252 
                                                  Total    10,000 

(2) ARRANGED ACCORDING TO ABSOLUTE FREQUENCIES 

E   1,628      R     769      F     252      C      86      I      22 
T   1,007      O     575      G     225      K      53      Z       9 
D   1,002      H     450      P     213      W      45      Q       8 
S     962      L     354      M     154      U      31      J       6 
N     872      A     269      X     116      B      22      V       4 
Y     866 
                                                  Total    10,000 
TABLE 3.-Relative frequencies of letters appearing in 1,000 letters based upon Table 2-B 

(1) ARRANGED ALPHABETICALLY 

A    73.66      G    16.38      L    36.42      Q     3.50      V    15.32 
B     9.74      H    33.88      M    24.74      R    75.76      W    15.60 
C    30.68      I    73.52      N    79.50      S    61.16      X     4.62 
D    42.44      J     1.64      O    75.28      T    91.90      Y    19.34 
E   129.96      K     2.96      P    26.70      U    26.00      Z      .98 
F    28.32 
                                                  Total    1,000.00 

(2) ARRANGED ACCORDING TO FREQUENCY 

E   129.96      I    73.52      C    30.68      Y    19.34      X     4.62 
T    91.90      S    61.16      F    28.32      G    16.38      Q     3.50 
N    79.50      D    42.44      P    26.70      W    15.60      K     2.96 
R    75.76      L    36.42      U    26.00      V    15.32      J     1.64 
O    75.28      H    33.88      M    24.74      B     9.74      Z      .98 
A    73.66 
                                                  Total    1,000.00 

(3) VOWELS 

A    73.66 
E   129.96 
I    73.52 
O    75.28 
U    26.00 
Y    19.34 

Total   397.76 

(4) HIGH-FREQUENCY CONSONANTS 

D    42.44 
N    79.50 
R    75.76 
S    61.16 
T    91.90 

Total   350.76 

(5) MEDIUM-FREQUENCY CONSONANTS 

B     9.74 
C    30.68 
F    28.32 
G    16.38 
H    33.88 
L    36.42 
M    24.74 
P    26.70 
V    15.32 
W    15.60 

Total   237.78 

(6) LOW-FREQUENCY CONSONANTS 

X     4.62 
Q     3.50 
K     2.96 
J     1.64 
Z      .98 

Total    13.70 

Total (3), (4), (5), (6)   1,000.00 
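The relative frequencies of Table 3 follow from the absolute frequencies of the combined 50,000 letters by simple proportion (each count divided by 50). The following Python sketch, with hypothetical names, reproduces that arithmetic and the four group totals; the counts are those of the combined five sets, and the grouping of letters is that of Table 2-C.

```python
# Absolute frequencies in the combined five sets (50,000 letters).
ABSOLUTE = {
    "A": 3683, "B": 487, "C": 1534, "D": 2122, "E": 6498, "F": 1416,
    "G": 819, "H": 1694, "I": 3676, "J": 82, "K": 148, "L": 1821,
    "M": 1237, "N": 3975, "O": 3764, "P": 1335, "Q": 175, "R": 3788,
    "S": 3058, "T": 4595, "U": 1300, "V": 766, "W": 780, "X": 231,
    "Y": 967, "Z": 49,
}

# Letter groups as given in Table 2-C.
GROUPS = {
    "vowels": "AEIOUY",
    "high-frequency consonants": "DNRST",
    "medium-frequency consonants": "BCFGHLMPVW",
    "low-frequency consonants": "JKQXZ",
}


def per_thousand(counts):
    """Relative frequency of each letter in 1,000 letters of text."""
    total = sum(counts.values())
    return {letter: round(1000 * n / total, 2) for letter, n in counts.items()}


def group_totals(counts, groups):
    """Relative frequency, per 1,000 letters, of each group of letters."""
    total = sum(counts.values())
    return {
        name: round(1000 * sum(counts[ch] for ch in letters) / total, 2)
        for name, letters in groups.items()
    }
```

For instance, `per_thousand(ABSOLUTE)["E"]` is 129.96, and `group_totals(ABSOLUTE, GROUPS)` gives 397.76 for the vowels and 13.70 for the low-frequency consonants.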


TABLE 6.-Frequency distribution of digraphs, based on 50,000 letters of Government plain-text telegrams; reduced to 5,000 digraphs 


SECOND LETTER 
Total 


A B c D 
E 
F G H I 
J K L 
M N 
0 
p Q R s T u v w x y z I 
Blanks 


A 
3 6 14 27 
1 
4 6 
2 17 1 2 32 14 
1 64 
2 12 
44 41 47 13 7 3 
12 
374 
3 
- 
- - 
- 
- 
- 
- - 
- 
- - - 
- 
- - - 
- - 
- - 
- 
- - 
- 
- 
·- 


B 
4 
18 
2 1 
6 
1 
4 
2 
1 
1 
2 
7 
49 
14 


- 
- 
- 
- 
- 
- 
- - 
- 
- - - 
- 
- 
- 
- 
- - - - 
- 
- 
- 
- 
- 
c 
20 
3 
1 32 
1 
14 
7 
4 
5 
1 
1 41 
4 
1 14 
4 
1 
1 
155 
8 
- - - 
- 
- 
- 
-· - 
- 
- 
- 
- 
- - - 
- 
- - 
- - - 
- - - 
- 
D 
32 4 
4 
8 33 
8 2 
2 27 1 
3 
5 
4 16 
5 2 12 13 15 
5 3 4 
1 
209 
3 
- 
- 
- 
- 
- 
- 
- - 
- 
- 
- - 
- 
- 
- 
- 
- - 
- 
- 
- 
- - - 
- 
E 
35 4 32 60 42 18 4 
7 27 1 
~I~ 


111 12 20 12 87 54 37 
3 20 7 7 4 1 
648 
1 
- 
F 
5 
2 
1 10 11 1 
39 
2 
1 
40 
1 
9 
3 11 
3 
1 
1 
141 
9 


- 
- - 
- 
- 
- 
- 
- 
- 
- 
- - 
- 
- 
- 
- 
- - 
- 
- 
- 
- - 
- - 
G 
7 
2 
1 14 
2 1 20 
5 1 
2 
1 
3 
6 
2 
5 
3 
4 
2 
1 
82 
7 


- 
- - 
- 
- 
- 
- 
- 
- 
- 
- - 
- - - 
- 
- - - - - 
,_ - - 
- 


H 
20 1 3 
2 20 
5 
33 
1 
2 
3 20 
1 1 17 
4 28 
8 
1 
1 
171 
7 
- 
- - 
- 
- - 
- - 
- 
- - - --- - 
- 
- - 
- - - --- 
I 
8 2 22 
6 13 10 19 
2 23 
9 75 41 
7 
27 35 27 
25 
15 
2 
368 
7 
- 
- - - 
- 
- 
- 
- 
- 
- - - 
- 
- 
- 
- 
- - 
- 
- 
- 
- - - 
- 
J 
1 
2 
2 
2 
7 
22 
- 
- - 
- 
- 
- 
- 
- 
- 
- 
- - - 
- - 
- 
- - 
- 
- 
- 
- - - - 
- 


K 
1 
1 
6 
2 
1 
1 
1 
13 
19 


- 
- - 
- 
- 
- 
- - 
- 
- - - - - 
- 
- 
- - - 
- 
- 
- - 
- 
- 
L 
28 3 
3 
9 37 
3 1 
1 20 
27 
2 
1 13 
3 
2 
6 
8 
2 2 2 
10 
183 
5 
- 
- - 
- 
- 
- 
- 
- 
- 
- 
- - 
- 
- 
- 
- 
- - 
- 
- 
- 
- - 
- 
- 


• M 
36 6 
3 
1 26 
1 
1 
9 
13 
10 
8 
2 
4 
2 
2 
2 
126 
10 
Ill 
- 
- - 
- 
- 
- 
- - 
- 
- - - 
- 
- 
- 
- 
- - 
- - - 
- - - - 
E N 
26 2 19 52 57 
927 
4 30 1 2 
5 
5 
8 18 
3 I 
4 24 82 
7 3 3 
5 
397 
2 


~ 
-1- - 
- 
- 
- 
- - 
- 
- 
- _,_ - 
- 
- 
- - - - - 
- - - - 
! 
0 


p 


7 4 
8 12 
3 25 2 
3 
5 1 2 19 25 77 
6 25 
64 14 19 37 7 8 1 2 
376 
2 


- 
- - 
- 
- 
- 
- - 
- 
- 
- - - 
- - 
- 
- 
- 
- - - 
- - - - 
14 1 
1 
1 23 
2 
3 
6 
13 
4 
1 17 11 
18 
6 
8 
3 1 1 
1 
135 
6 


- 
- - 
- 
- 
- 
- - - 
- - - 
- 
- 
- 
- 
- - 
- 
- - 
- - - 
- 
Q 
1 
1 
15 
17 
23 
- 
- - 
- 
- 
- 
- - 
- 
- 
- -.-- 
- - 
- - 
- - - - - 
- 
- 
R 
39 2 
9 17 98 
6 7 
3 30 1 1 
5 
9 
7 28 13 
11 31 42 
5 5 4 
9 
382 
3 


- 
- - 
- 
- 
- 
- 
- 
- 
- 
- - 
- 
- - 
- 
- - 
- 
- 
,-- - - - - 
s 
24 3 13 
5 49 12 2 26 34 
1 
2 
3 
4 15 10 
5 19 63 11 1 4 
1 
307 
4 


- 
- - 
- 
- 
- 
- - - 
- - - 
- 
- 
- 
- 
- - - - 
- 
- - - - 
T 
28 3 
6 
6 71 
7 1 78 45 
5 
6 
7 50 
2 1 17 19 19 
5 
36 
41 1 
454 
4 
- 
- - 
- 
- 
- 
- - 
- 
- - - 
- - - 
- 
- - 
- - - 
- - - - 
u 
5 3 
3 
3 11 
1 8 
5 
6 
5 21 
1 
2 
31 12 12 
1 
130 
9 
- 
- - 
- 
- 
- 
- - - 
- 
- - - - 
- 
- 
- - 
- 
- - 
- - 
- 
- 
v 
6 
57 
12 
1 
1 
77 
21 
- 
- - 
- 
- 
- 
- - 
- 
- - - 
- 
- 
- 
- 
- - 
- - - 
- - - 
- 
w 12 
22 
4 13 
1 
2 19 
1 
1 
1 
76 
16 


I--- - - 
- 
- 
- 
- - - 
- 
- - - - 
- 
- 
- - 
- - - 
- 
- - - 
x 
2 
2 
1 
1 
1 
1 
2 
1 
1 
2 
1 
1 
7 
23 
13 


I--- - - 
- 
- 
- 
- - 
- 
- 
- -,-- 
- 
-.-- 
- - 
- 
- - 
- 
- 
.Y 
6 2 
4 
4 
9 11 1 
1 
3 
2' 
2 
6 10 
3 
4 11 15 
1 
I 
96 
7 


- 
- - 
- 
- - 
- - 
- 
- 
- - 
- 
- 
- 
- 
- - 
- - - 
- - - -· 
z 
1 
2 
1 
4 
23 


~ 
- - 
- 
- 
- 
- ·-- 
- - - 
- 
- 
- 
- 
- - 
- 
- 
- 
- - - - 
·- 
¥------- 370 46 154 217 657 137 82 170 374 814 189 123 397 373 130 17 368 304 462 130 75 77 23 99 4 5,000 


~ 
- - 
- 
- 
- 
- 
- 
- 
- 
- - -- - 
- 
- 
- - 
- - - - - ,_ - 
1 11 
6 
7 
1 
712 10 
318 19 
6 
6 
7 
3 
821 
4 
4 
5 
715 11 23 10 23 
248 




TABLE 4.-Frequency distribution for 10,000 letters of literary English, as compiled by Hitt 1 

(1) ALPHABETICALLY ARRANGED 

A     778      G     174      L     372      Q       8      V     112 
B     141      H     595      M     288      R     651      W     176 
C     296      I     667      N     686      S     622      X      27 
D     402      J      51      O     807      T     855      Y     196 
E   1,277      K      74      P     223      U     308      Z      17 
F     197 

(2) ARRANGED ACCORDING TO FREQUENCY 

E   1,277      R     651      U     308      Y     196      K      74 
T     855      S     622      C     296      W     176      J      51 
O     807      H     595      M     288      G     174      X      27 
A     778      D     402      P     223      B     141      Z      17 
N     686      L     372      F     197      V     112      Q       8 
I     667 


TABLE 5.-Frequency distribution for 10,000 letters of telegraphic English, as compiled by Hitt


(1) ALPHABETICALLY ARRANGED

A     813      G     201      L     392      Q      38      V     136
B     149      H     386      M     273      R     677      W     166
C     306      I     711      N     718      S     656      X      51
D     417      J      42      O     844      T     634      Y     208
E   1,319      K      88      P     243      U     321      Z       6
F     205


(2) ARRANGED ACCORDING TO FREQUENCY

E   1,319      S     656      U     321      F     205      K      88
O     844      T     634      C     306      G     201      X      51
A     813      D     417      M     273      W     166      J      42
N     718      L     392      P     243      B     149      Q      38
I     711      H     386      Y     208      V     136      Z       6
R     677
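The two Hitt distributions above can be checked and compared mechanically. The following is a minimal sketch, not part of the original text: the names LITERARY, TELEGRAPHIC, ranking, and percent are illustrative assumptions, and the Table 5 entries for B and R are taken as 149 and 677, the readings under which that table sums to exactly 10,000.

```python
# Counts transcribed from Tables 4 and 5 (Hitt), in alphabetical order A..Z.
# Assumption: B = 149 and R = 677 in Table 5, the readings that make the
# table total exactly 10,000 letters.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

LITERARY = dict(zip(ALPHABET, [
    778, 141, 296, 402, 1277, 197, 174, 595, 667, 51, 74, 372, 288,
    686, 807, 223, 8, 651, 622, 855, 308, 112, 176, 27, 196, 17,
]))

TELEGRAPHIC = dict(zip(ALPHABET, [
    813, 149, 306, 417, 1319, 205, 201, 386, 711, 42, 88, 392, 273,
    718, 844, 243, 38, 677, 656, 634, 321, 136, 166, 51, 208, 6,
]))


def ranking(counts):
    """Return the letters ordered from most to least frequent."""
    return "".join(sorted(counts, key=lambda c: (-counts[c], c)))


def percent(counts):
    """Convert absolute counts into percentages of the total."""
    total = sum(counts.values())
    return {c: 100 * n / total for c, n in counts.items()}


if __name__ == "__main__":
    for name, table in (("literary", LITERARY), ("telegraphic", TELEGRAPHIC)):
        print(name, sum(table.values()), ranking(table))
```

Comparing the two rankings shows the main shift between the two kinds of text: T falls from second place in literary English to eighth place in telegraphic English.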


1 Hitt, Capt. Parker. Manual for the Solution of Military Ciphers. Army Service Schools Press, Fort Leavenworth, Kansas, 1916.


TABLE 7-A.-The 438 different digraphs of Table 6 arranged according to their absolute frequencies
EN ______________ 
111 
EC ______________ 
RE ______________ 
98 
RS ______________ 


ER_ _____________ 
87 
UR ______________ 
NT ______________ 
82 
NI _____________ . 


T}L _____________ 
78 
RI.------------- 
ON ______________ 
77 
EL_. ____________ 
IN ______________ 
75 
HT ______________ 


TE. _____________ 
71 
LA_ _____________ 
AN _____________ 
64 
RQ ______________ 


QR_ _____________ 
64 
TA ______________ 


ST _____________ 
63 
ED. _____________ 
60 
'2, 
NE ______________ 
57 
LL ______________ 


VE-------------- 
57 
AD ______________ 


ES ______________ 
54 
DI ______________ 
ND _________ ._ ____ 
52 
EI.------------- 
TO ______________ 
50 
IR_ _____________ 


SE _____________ 
49 
.IT ______________ 
NG ______________ 


1 1,249
ME ______________ 
AT _____________ 
47 
NA _____________ 
TI _____________ 
45 
SH_ _____________ 
AR ______________ 
44 
IV ______________ 
EE ______________ 
42 
OF ______________ 
RT ______________ 
42 
QM_ _____________ 


AS. _____________ 
41 
Qp ______________ 
co ______________ 
41 
NS ______________ 
IQ ______________ 
41 
SA.. _____________ 
TY ______________ 
41 
IL ______________ 


FO ______________ 
40 
PE.·------~----- 
FI ______________ 
39 
IC.------------- 
RA. _____________ 
39 
WE.------·------ 
ET ______________ 
37 
UN ______________ 
ou ______________ 
37 
CA ______________ 


·LE ______________ 
37 
EP ______________ 
MA ______________ 
36 
EV ______________ 
TW ______________ 
36 
GH ______________ 
EA ______________ 
35 
HA ______________ 
rs ______________ 
35 
HE ______________ 


SI ______________ 
34 
HQ ______________ 
DE ______________ 
33 
LI ______________ 
HI ______________ 
33 
ss ______________ 
AL ______________ 
32 
TT ______________ 


CE ______________ 
32 
I G ______________ 
DA ______________ 
32 
NC ______________ 


1 The 18 digraphs above this line compose 25% of the total.
2 The 53 digraphs above this line compose 50% of the total.
3 The 117 digraphs above this line compose 75% of the total.

32 
OL ______________ 
19 
us ______________ 
12 
31 
OT ______________ 
19 
UT ______________ 
12 
31 
TS ______________ 
19 
VI ______________ 
12 


30 
wo ______________ 
19 
WA ______________ 
12 


30 
BE ______________ 
18 
FF ______________ 
11 
29 
EF ______________ 
18 
pp ______________ 
11 
28 
NO ______________ 
18 
RR_ _____________ 
11 
28 
PR ______________ 
18 
UE ______________ 
11 
28 
AI ______________ 
17 
FT ______________ 
11 
28 
HR_ _____________ 
17 
su ______________ 
11 
PO ______________ 
17 
YF ______________ 
11 
2 2,495
RD _____________ 
17 
YS ______________ 
11 
27 
TR_ _____________ 
17 
YQ ______________ 
10 


27 
oo ______________ 
16 
FE ______________ 
10 


27 
DT ______________ 
15 
IF ______________ 
10 


27 
IX.. _____________ 
15 
LY ______________ 
10 


27 
QU ___________ ,_ 
15 
MQ ______________ 
10 


27 
so _____________ 
15 
sp ______________ 
10 


27 
YT _____________ 
15 
YE_ _____________ 
9 
26 
AC ______________ 
14 
FR_ _____________ 
9 
26 


AM_ _____________ 
14 
IM. _____________ 
9 
26 


CH_ _____________ 
14 
LO ______________ 
9 
25 
CT ______________ 
14 
MI ______________ 
9 
25 
EM ______________ 
14 
NF ______________ 
9 
25 
GE ______________ 
14 
RC ______________ 
9 
25 
QS ______________ 
14 
RM_ ____________ 
9 
24 
PA ______________ 
14 
RY ______________ 
9 
24 
PL_ _____________ 
13 
DD ______________ 
8 
23 
RP-------------- 
13 
NN ______________ 
8 
23 
sc ______________ 
13 
OF ______________ 
8 
22 
WI ______________ 
13 
IA.. _____________ 
8 
22 


MM_ _____________ 
13 
HU ______________ 
8 
21 
os ______________ 
13 
LT ______________ 
8 


20 
AU ______________ 
13 
MP ______________ 
8 


20 
IE ______________ 
13 
oc ______________ 
8 


20 
LO ______________ 
13 
ow ______________ 
8 


20 
PT ____________ :. 
8 


20 
3 3,745
UG ______________ 
8 


20 
AP ______________ 
12 
AV ______________ 
7 


20 
DR_ _____________ 
12 
BY ______________ 
7 


20 
EQ. _____________ 
12 
er ______________ 
7 
19 
AY ______________ 
12 
EH_ _____________ 
7 
19 
EO ______________ 
12 
OA ______________ 
7 
19 
oo ______________ 
12 
EW ______________ 
7 
19 
SF ______________ 
12 
EX_ _____________ 
7 




GA-------------- 
7 
so ______________ 
a 
ov ______________ 
3 
KI •. ---------- 
2 
IP ______________ 
7 
SR_ _____________ 
5 
AA--·-·--------- 
3 


LM_ ____________ 
2 


NU-------------- 
7 
TL...------------ 
5 
EU •••••••••••••• 
3 


LR_ _____________ 
2 
ov ______________ 
7 
TU ______________ 
5 
OE. _____________ 
3 
LU ______________ 
2 


RG-------------- 
7 
UM_ _____________ 
5 
YI. _____________ 
3 
LV .• ------------ 
2 
RN ______________ 
7 
AF ______________ 
4 
FS .............. 
3 
LW ______________ 
2 


TE-------------- 
7 
BA...-----------· 
4 
FU ______________ 
3 


MR__ __________ 
2 
TN ______________ 
7 
BQ ______________ 
4 
GN ______________ 
3 
MT •• ------------ 
2 


XT-------------- 
7 
CK_ _____________ 
4 
GS ______________ 
3 
MU ____________ 
2 


AB-------------- 
6 
CR_ _____________ 
4 
HC ______________ 
3 
MY ______________ 
2 


AG ______________ 
6 
cu ______________ 
4 
HN ______________ 
3 
NB .. ------------ 
2 
BL ______________ 
6 
DB .. ------------ 
4 
LB ______________ 
3 


NK_ _____________ 
2 
oo ______________ 
6 
DC .. ------------ 
4 
LC ______________ 
3 
OG ______________ 
2 
y .A_ _____________ 
6 
DN .. ------------ 
4 
LF ______________ 
3 
OK_ _____________ 
2 


GQ ______________ 
6 
ow ______________ 
4 
LP ______________ 
3 
PF ______________ 
2 


ID-------------- 
6 
EB ______________ 
4 
MC ______________ 
3 
RB ______________ 
2 


KE-------------- 
6 
EG .• ------------ 
4 
NP. _____________ 
3 
SG ______________ 
2 


L$ ______________ 
6 
EY ______________ 
4 
NV ______________ 
3 
SL_ _____________ 
2 


MB_:. ____________ 
6 
GT ______________ 
4 
NW ______________ 
3 
TP ______________ 
2 
PI ______________ 
6 
HS .. ------------ 
4 
OH.. _____________ 
3 
UP ______________ 
2 


pg ______________ 
6 
MS •• ------------ 
4 


AH_ _____________ 
2 
WN ______________ 
2 


RF _________ ··-·- 
6 


NH_ _____________ 
4 
AK.. ____________ 
2 
XA ______________ 
2 


TC •••.•••••••.•. 
6 


NR_ _____________ 
4 
BI ______________ 
2 
xc ______________ 
2 
TD ______________ 
6 
OB ______________ 
4 
BR _____________ 
2 
XI ______________ 
2 


™-------------- 
6 
PM...------------ 
4 
BU _____ --------- 
2 
XP ... ------------ 
2 


UL-------------- 
6 
RW .• ------------ 
4 
DG ______________ 
2 
YB ______________ 
2 


VA------------·- 
6 
SN ______________ 
4 
DH ______________ 
2 
YL_ ____________ 
2 


YN ______________ 
6 
sw ______________ 
4 
DO ______________ 
2 


YM_ _____________ 
2 


CL ______________ 
5 
WH ______________ 
4 
AO ______________ 
2 
ZE ______________ 
2 


OM_ _____________ 
5 
YC.---------·--- 
4 
OY ______________ 
2 
GG ______________ 
1 


DP ______________ 
5 
YD---------·---- 
4 
FC ______________ 
2 
AJ ______________ 
1 
ou ______________ 
5 
YR------------ 
4 
FL ______________ 
2 
BJ ______________ 
1 
QI ______________ 
5 
PH_ ____________ 
3 
GC ______________ 
2 
BM.. _____________ 
1 


UA ______________ 
5 
PU ______________ 
3 
GF ______________ 
2 
BS ______________ 
1 
ur ______________ 
5 
RH_ _____________ 
3 
GL ______________ 
2 
BT ______________ 
1 
FA _____________ 
5 
SB ______________ 
3 
GP ______________ 
2 
co ______________ 
1 
GI ______________ 
5 
SM_ _____________ 
3 
GU ______________ 
2 
CF ______________ 
1 


GR ________ ------ 
5 
TB .. ------------ 
3 
HD ______________ 
2 
CM ______________ 
1 
HF ______________ 
5 
UB ______________ 
3 
HM ____ ---------- 
2 
CN ______________ 
1 


NL-----·-------- 
5 
uc ______________ 
3 
IB ______________ 
2 
cs ______________ 
1 


NM ______________ 
5 
uo ______________ 
3 
IK----.---------- 
2 
cw ______________ 
1 


NY ______________ 
5 
yp ______________ 
3 
rz ______________ 
2 
CY ______________ 
1 


RL ______________ 
5 
cc ______________ 
3 
JE ______________ 
2 
DJ ______________ 
1 


RU ______________ 
5 
AW .. ------------ 
3 
JO ______________ 
2 
DY ______________ 
1 


RV---··---------- 
5 
DL.------------- 
3 
JU ______________ 
2 
E-l----"·~----~~- 
1 




AJC ____________ 
1 
HY __________ 
1 
PD---------·-· 
1 
wt ______________ 
1 
uo ___________ 
1 
J A_ __________ 
1 
PN·--------·-· 
I 
WR_ ___________ 
1 
YU _____________ 
1 


KA_ ___________ 
1 
PV _____________ 
l 
ws ______________ 
1 
EZ_ ____________ 
1 
KC ___________ 
1 
PW--------·-· 
l 
WY----------···· 
1 
FD ____________ 
1 
KL_ __________ 
1 
py _____________ 
1 
XD----------·-·· 
t 
FG __________ 
1 
KN-----------·· 
1 
QM_ ____________ 
l 
XE ______________ 
1 
Fil _____________ 
1 
KB---------·-·· 
1 
QR_ ___________ 
1 
xr ______________ 
1 
FP __________ 
1 
LG _____________ 
1 
RJ ____________ 
1 
XH_ _____________ 
1 


"------------·- 
1 
LH_ __________ 
1 


RK_ ___________ 
1 
XN ______________ 
1 


FY----------· 
1 
LN---------···· 
1 
Sit __________ 
1 
xo ______________ 
1 
GD ___________ 
1 
MD----------···· 
1 
sv ____________ 
1 
XR_ _____________ 
l 
GJ ______________ 
1 
MF---------···· 
1 
SY ____________ 
1 
xs ______________ 
1 
Gil _________ 
1 


MH_ __________ 
1 
TG ____________ 
1 
YG ______________ 
1 
GI __________ 
1 
NJ ___________ 
1 
TQ_ ____________ 
1 
YH_ _____________ 
1 
HB ______ 
l 
NQ. ____________ 
1 
TZ..·--------·· 
1 
Ylf _____________ 
1 
HL__ _________ 
1 
OJ ___________ 
1 
UF ______________ 
I 
ZA ______________ 
1 
HP ______ 
1 
ox. ____________ 
1 
uv _____________ 
1 
ZI ______________ 
l 
HQ. ____ 
1 
PB ______________ 
1 
vo ____________ 
1 
--- 
HI _________ 
1: 
PC __________ . 
1 
VT ____________ 
1 
Total ______ 
5,000 


TABLE 7-B.-The 18 digraphs composing 25% of the digraphs in Table 6, arranged alphabetically according to their initial letters


(1) AND ACCORDING TO THEIR FINAL LETTERS

AN 64
ED 60    EN 111   ER 87    ES 54
IN 75
ND 52    NE 57    NT 82
ON 77    OR 64
RE 98
SE 49    ST 63
TE 71    TH 78    TO 50
VE 57

Total ...... 1,249


(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

AN 64
EN 111   ER 87    ED 60    ES 54
IN 75
NT 82    NE 57    ND 52
ON 77    OR 64
RE 98
ST 63    SE 49
TH 78    TE 71    TO 50
VE 57

Total ...... 1,249
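The two arrangements used in Table 7-B differ only in the sort key applied within each initial letter. A minimal sketch follows, using the 18 digraphs and counts of the table; the names TOP_18, by_final_letter, and by_frequency are illustrative assumptions and do not come from the original text.

```python
# The 18 most frequent digraphs of Table 6 with their absolute frequencies,
# as listed in Table 7-B.
TOP_18 = {
    "EN": 111, "RE": 98, "ER": 87, "NT": 82, "TH": 78, "ON": 77,
    "IN": 75, "TE": 71, "AN": 64, "OR": 64, "ST": 63, "ED": 60,
    "NE": 57, "VE": 57, "ES": 54, "ND": 52, "TO": 50, "SE": 49,
}


def by_final_letter(counts):
    """Arrangement (1): by initial letter, then by final letter."""
    return sorted(counts, key=lambda d: (d[0], d[1]))


def by_frequency(counts):
    """Arrangement (2): by initial letter, then by descending frequency."""
    return sorted(counts, key=lambda d: (d[0], -counts[d]))


if __name__ == "__main__":
    print(" ".join(by_final_letter(TOP_18)))
    print(" ".join(by_frequency(TOP_18)))
    total = sum(TOP_18.values())
    # Share of the 5,000 digraphs of Table 6 covered by these 18.
    print(total, f"{100 * total / 5000:.2f}%")
```

The sum of the 18 counts is 1,249, which is 24.98 percent of the 5,000 digraphs and is rounded to 25% in the table title.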


TABLE 7-C.-The 53 digraphs composing 50% of the 5,000 digraphs of Table 6, arranged alphabetically according to their initial letters


(1) AND ACCORDING TO THEIR FINAL LETTERS

AL 32    AN 64    AR 44    AS 41    AT 47
CE 32    CO 41
DA 32    DE 33
EA 35    EC 32    ED 60    EE 42    EL 29    EN 111   ER 87    ES 54    ET 37
FI 39    FO 40
HI 33    HT 28
IN 75    IO 41    IS 35
LA 28    LE 37
MA 36
ND 52    NE 57    NI 30    NT 82
ON 77    OR 64    OU 37
RA 39    RE 98    RI 30    RO 28    RS 31    RT 42
SE 49    SI 34    ST 63
TA 28    TE 71    TH 78    TI 45    TO 50    TW 36    TY 41
UR 31
VE 57

Total ...... 2,495


(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

AN 64    AT 47    AR 44    AS 41    AL 32
CO 41    CE 32
DE 33    DA 32
EN 111   ER 87    ED 60    ES 54    EE 42    ET 37    EA 35    EC 32    EL 29
FO 40    FI 39
HI 33    HT 28
IN 75    IO 41    IS 35
LE 37    LA 28
MA 36
NT 82    NE 57    ND 52    NI 30
ON 77    OR 64    OU 37
RE 98    RT 42    RA 39    RS 31    RI 30    RO 28
ST 63    SE 49    SI 34
TH 78    TE 71    TO 50    TI 45    TY 41    TW 36    TA 28
UR 31
VE 57

Total ...... 2,495


TABLE 7-D.-The 117 digraphs composing 75% of the 5,000 digraphs of Table 6, arranged alphabetically according to their initial letters


(1) AND ACCORDING TO THEIR FINAL LETTERS

AC 14    AD 27    AI 17    AL 32    AM 14    AN 64    AR 44    AS 41    AT 47    AU 13
BE 18
CA 20    CE 32    CH 14    CO 41    CT 14
DA 32    DE 33    DI 27    DO 16    DS 13    DT 15
EA 35    EC 32    ED 60    EE 42    EF 18    EI 27    EL 29    EM 14    EN 111   EP 20    ER 87    ES 54    ET 37    EV 20
FI 39    FO 40
GE 14    GH 20
HA 20    HE 20    HI 33    HO 20    HR 17    HT 28
IC 22    IE 13    IG 19    IL 23    IN 75    IO 41    IR 27    IS 35    IT 27    IV 25    IX 15
LA 28    LE 37    LI 20    LL 27    LO 13
MA 36    ME 26
NA 26    NC 19    ND 52    NE 57    NG 27    NI 30    NO 18    NS 24    NT 82
OF 25    OL 19    OM 25    ON 77    OP 25    OR 64    OS 14    OT 19    OU 37
PA 14    PE 23    PO 17    PR 18
QU 15
RA 39    RD 17    RE 98    RI 30    RO 28    RS 31    RT 42
SA 24    SE 49    SH 26    SI 34    SO 15    SS 19    ST 63
TA 28    TE 71    TH 78    TI 45    TO 50    TR 17    TS 19    TT 19    TW 36    TY 41
UN 21    UR 31
VE 57
WE 22    WO 19
YT 15

Total ...... 3,745


(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

AN 64    AT 47    AR 44    AS 41    AL 32    AD 27    AI 17    AC 14    AM 14    AU 13
BE 18
CO 41    CE 32    CA 20    CH 14    CT 14
DE 33    DA 32    DI 27    DO 16    DT 15    DS 13
EN 111   ER 87    ED 60    ES 54    EE 42    ET 37    EA 35    EC 32    EL 29    EI 27    EP 20    EV 20    EF 18    EM 14
FO 40    FI 39
GH 20    GE 14
HI 33    HT 28    HA 20    HE 20    HO 20    HR 17
IN 75    IO 41    IS 35    IR 27    IT 27    IV 25    IL 23    IC 22    IG 19    IX 15    IE 13
LE 37    LA 28    LL 27    LI 20    LO 13
MA 36    ME 26
NT 82    NE 57    ND 52    NI 30    NG 27    NA 26    NS 24    NC 19    NO 18
ON 77    OR 64    OU 37    OF 25    OM 25    OP 25    OL 19    OT 19    OS 14
PE 23    PR 18    PO 17    PA 14
QU 15
RE 98    RT 42    RA 39    RS 31    RI 30    RO 28    RD 17
ST 63    SE 49    SI 34    SH 26    SA 24    SS 19    SO 15
TH 78    TE 71    TO 50    TI 45    TY 41    TW 36    TA 28    TS 19    TT 19    TR 17
UR 31    UN 21
VE 57
WE 22    WO 19
YT 15

Total ...... 3,745
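The 25%, 50%, and 75% dividing lines of Tables 7-A through 7-D are running totals over the digraphs taken in descending order of frequency. The sketch below recomputes them from the 117 digraphs of Table 7-D; the names DATA, parse, and coverage are illustrative assumptions, and the counts are the readings of the table under which the printed total of 3,745 is reproduced.

```python
# The 117 digraphs of Table 7-D as "digraph count" pairs.
DATA = """
AC 14 AD 27 AI 17 AL 32 AM 14 AN 64 AR 44 AS 41 AT 47 AU 13
BE 18
CA 20 CE 32 CH 14 CO 41 CT 14
DA 32 DE 33 DI 27 DO 16 DS 13 DT 15
EA 35 EC 32 ED 60 EE 42 EF 18 EI 27 EL 29 EM 14 EN 111 EP 20 ER 87 ES 54 ET 37 EV 20
FI 39 FO 40
GE 14 GH 20
HA 20 HE 20 HI 33 HO 20 HR 17 HT 28
IC 22 IE 13 IG 19 IL 23 IN 75 IO 41 IR 27 IS 35 IT 27 IV 25 IX 15
LA 28 LE 37 LI 20 LL 27 LO 13
MA 36 ME 26
NA 26 NC 19 ND 52 NE 57 NG 27 NI 30 NO 18 NS 24 NT 82
OF 25 OL 19 OM 25 ON 77 OP 25 OR 64 OS 14 OT 19 OU 37
PA 14 PE 23 PO 17 PR 18
QU 15
RA 39 RD 17 RE 98 RI 30 RO 28 RS 31 RT 42
SA 24 SE 49 SH 26 SI 34 SO 15 SS 19 ST 63
TA 28 TE 71 TH 78 TI 45 TO 50 TR 17 TS 19 TT 19 TW 36 TY 41
UN 21 UR 31
VE 57
WE 22 WO 19
YT 15
"""

TOTAL_DIGRAPHS = 5000  # size of the sample underlying Table 6


def parse(text):
    """Turn the whitespace-separated 'digraph count' pairs into a dict."""
    tokens = text.split()
    return {tokens[i]: int(tokens[i + 1]) for i in range(0, len(tokens), 2)}


def coverage(counts, n):
    """Sum of the n largest counts, i.e. occurrences covered by the top n."""
    return sum(sorted(counts.values(), reverse=True)[:n])


if __name__ == "__main__":
    counts = parse(DATA)
    for n in (18, 53, 117):
        covered = coverage(counts, n)
        print(n, covered, f"{100 * covered / TOTAL_DIGRAPHS:.1f}%")
```

Under these readings the top 18, 53, and 117 digraphs account for 1,249, 2,495, and 3,745 of the 5,000 digraphs respectively, matching the totals printed in Tables 7-B, 7-C, and 7-D.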


TABLE 7-E.-All the 438 digraphs of Table 6, arranged first alphabetically according to their initial letters and then alphabetically according to their final letters.


(SEE TABLE 6.-READ ACROSS THE ROWS)


TABLE 8.-The 438 different digraphs of Table 6, arranged first alphabetically according to their initial letters, and then according to their absolute frequencies under each initial letter 1


AN ______________ 
64 
CT ____________ 
14 
ED ______________ 
60 
Gll _____________ 
20 
AT ______________ 
47 
er ______________ 
7 
ES ______________ 
54 
GE ______________ 
14 


AR_ _____________ 
44 
CL-------------- 
5 
EE ______________ 
42 
GA-------------- 
7 
AS-------------- 
41 
CK_ _____________ 
4 
ET ______________ 
37 
GO ______________ 
6 
AL------------- 
32 
CR.. _____________ 
4 
EA ______________ 
35 
GI _______________ 
5 
AD ____________ 
27 
cu _____________ 
4 
EC ______________ 
32 
GR.. _____________ 
5 
AI-------------- 
17 
cc _____________ 
3 
EL-------------- 
29 
GT ______________ 
4 
AC _____________ 
14 
co _____________ 
1 
EI ______________ 
27 
GN ______________ 
3 
AM_ ___________ 
14 
CF------------ 
1 
EP ______________ 
20 
GS ______________ 
3 
AU __________ 
13 
CM_ _____________ 
1 
EV ______________ 
20 
GC _____________ 
2 
AP------------ 
12 
CN ____________ 
1 
Ef ______________ 
18 
GF ______________ 
2 
A y _____________ 
12 
cs ______________ 
1 
EM_ _____________ 
14 
GL ______________ 
2 
AV _____________ 
7 
cw _____________ 
1 
EO ______________ 
12 
GP ______________ 
2 
AB-------------- 
6 
CY ______________ 
1 
EQ_ _____________ 
12 
GU ______________ 
2 
AG ______________ 
6 
EH_ _____________ 
7 
GD ______________ 
1 


Al'-------------- 
4 
DE_ _____________ 
33 
EW ______________ 
7 
GG ______________ 
1 


AA-------------- 
3 
DA.. _____________ 
32 
EX_ _____________ 
7 
GJ ______________ 
1 
AW. _____________ 
3 
DL ____________ 
27 
EB-------------- 
4 
~------------- 
1 
Aff.. _____________ 
2 
DO ______________ 
16 
EG ______________ 
4 
GW • _____________ 
1 
AK.. _____________ 
2 
DT ______________ 
15 
EY-------------- 
4 
AO ____________ 
2 
DS~------------- 
13 
EU ______________ 
3 
AE.. ___________ 
1 
DR------------- 
12 
EJ ______________ 
1 
AJ ___________ 
1 
DD ______________ 
8 
EZ ___________ -- 
1 
HI-------------- 
33 
DF-------------- 
8 
HT-------------- 
28 
BE ____________ 
18 
DM..------------- 
5 
FO ______________ 
40 
HA_ _____________ 
20 
BY ______________ 
7 
op ______________ 
5 
FI ______________ 
39 
HE-------------- 
20 


BL----------- 
6 
DtJ ______________ 
5 
FF ____________ 
11 
HQ ______________ 
20 
BA ________ 
4 
oo ______________ 
4 
FT-------------- 
11 
HR_ _____________ 
17 
BO _____________ 
4 
DC ______________ 
4 
FE ______________ 
10 
HU ______________ 
8 
BI ___________ 
2 
DN ______________ 
4 
FR_ _____________ 
9 
HF-------------- 
5 
BR.. _____________ 
2 
ow ______________ 
4 
FA ______________ 
5 
HS ______________ 
4 
BU ______________ 
2 
DL.. _____________ 
3 
FS ______________ 
3 
HC ______________ 
3 
BJ ______________ 
1 
ov ______________ 
3 
FU ______________ 
3 
HN-------------- 
3 
BM__ ___________ 
1 
DG ______________ 
2 
FC ______________ 
2 
HD ______________ 
2 
BS ____________ 
1 
DH.. _____________ 
2 
FL-------------- 
2 
HM_ _____________ 
2 
BT ______________ 
1 
DQ ______________ 
2 
FD-------------- 
1 
HB------------- 
1 
DJ ______________ 
1 
FG ______________ 
1 
HL_ _____________ 
1 
co ______________ 
41 
DY ______________ 
1 
FM_ _____________ 
1 
HP ______________ 
1 
CE ______________ 
32 
FP ______________ 
1 
HQ ______________ 
1 
CA ______________ 
20 
EN ___________ -• 
111 
FW ______________ 
1 
HW-------------- 
1 
CH ______________ 
14 
ER.. _____________ 
87 
FY-------------- 
1 
HY ______________ 
1 


1 For arrangement alphabetically first under initial letters and then under final letters, see Table 6.


IN ______________ 
75 
LI ______________ 
20 
NE ______________ 
57 
QA ______________ 
7 


IO _____________ 
41 
LO ______________ 
13 
ND ______________ 
52 
ov ______________ 
7 


IS ______________ 
35 
LY ______________ 
10 
NI~------------- 
30 
oo ______________ 
6 
IR.. _____________ 
27 
LO ______________ 
9 
NG ______________ 
27 
or ______________ 
5 


IT ______________ 
27 
LT. __________ 
8 
NA.. _____________ 
26 
OB------------ 
4 
IV ______________ 
25 
LS _____________ 
6 
NS _____________ 
24 
OE-------------- 
3 
IL ______________ 
23 
LB------------ 
3 
NC ____________ 
19 
OH_ _____________ 
3 


IC ______________ 
22 
LC _____________ 
3 
NO _____________ 
18 
QG _____________ 
2 


IG ______________ 
19 
LF _____________ 
3 
NF ______________ 
9 
OK. _____________ 
2 
IX.. _____________ 
15 
LP-------------- 
3 
NN ______________ 
8 
OY ______________ 
2 


IE ______________ 
13 
LM._ _____________ 
2 
NU ______________ 
7 
OJ _____________ 
1 


IF-------------- 
10 
LR.. _____________ 
2 
NI, ______________ 
5 
ox ______________ 
1 


IM_ _____________ 
9 
LU ______________ 
2 


NM_ ________ 
5 


·IA ______________ 
8 
LV ______________ 
2 
NY-------------- 
5 
PE ______________ 
23 


IP-------------- 
7 
LW _____________ 
2 
NH ______________ 
4 
PR.. _____________ 
18 
ID ______________ 
6 
LG ______________ 
1 
NR_ _____________ 
4 
PO ______________ 
17 
· m _____________ 
1 
NP ______________ 
3 
p A,_ _____________ 
14 


[8 ______________ 
2 
LN------------ 
1 
NV _____________ 
3 
PL-------------- 
13 
.IK ______________ 
2 
NW _____________ 
3 
pp ______________ 
11 
·rz ______________ 
2 
'MA_ _____________ 
36 
NB-------------- 
2 
PT ______________ 
8 


:ME_ ____________ 
26 
"NK_ _____________ 
2 
PI ____________ 
6 


JE ______________ 
2 
MM_ _____________ 
13 
NJ ______________ 
1 
PS ____________ 
6 


JO ______________ 
2 
MO ______________ 
10 
NQ ______________ 
1 
PM. _________ 
4 


JU ______________ 
2 
MI------------- 
9 
pH_ _____________ 
3 
J A.. _____________ 
1 
yp _____________ 
8 
ON ______________ 
77 
PU ____________ 
3 


MB------------- 
.6 
OR.. __________ .. __ 
64 
PF _____________ 
2 
KE ______________ 
6 
MS ____________ 
4 
ou _____________ 
37 
PB----------- 
1 
KI ______________ 
2 
MC ___________ 
3 
OF ______________ 
25 
PC ___________ 
1 
KA ______________ 
1 
,MR __________ 
2 
QM_ _____________ 
25 
PD ____________ 
1 


lKC ______________ 
1 
:MT ______________ 
2 
OP ______________ 
25 
PN ____________ 
1 


KL-------------- 
1 
MlJ ______________ 
2 
OL _____________ 
19 
PV ____________ 
1 
KN ______________ 
1 
J4Y ______________ 
2 
OT ______________ 
19 
PW _____________ 
1 
xs ______________ 
1 
MD-------------- 
1 
os ______________ 
14 
py _____________ 
1 


MF ______________ 
1 
LE-------------- 
37 
MH ______________ 
1 
OD ______________ 
12 
QU ____________ 
15 


LA_ ____________ 
28 
oc ______________ 
8 
QM ______________ 
1 


LL-------------- 
27 
NT ______________ 
82 
ow ______________ 
8 
QR ______________ 
1 




RE ______________ 
98 
SR ______________ 
5 
us ______________ 
12 
XI ______________ 
2 
RT ______________ 
42 
SN ______________ 
4 
UT ______________ 
12 
XP ______________ 
2 
RA ______________ 
39 
SW ______________ 
4 
UE ______________ 
11 
XD ______________ 
1 
RS ______________ 
31 
SB ______________ 
3 
UG ______________ 
8 
XE ______________ 
1 
RI ______________ 
30 
SM ______________ 
3 
UL ______________ 
6 
XF ______________ 
1 


RO ______________ 
28 
SG ______________ 
2 
UA ______________ 
5 
XH ______________ 
1 
RD ______________ 
17 
SL ______________ 
2 
UI ______________ 
5 
XN ______________ 
1 


RP----------·--- 
13 
SK ______________ 
1 
UM ______________ 
5 
XO ______________ 
1 
RR ______________ 
1 1 
sv ______________ 
1 
UB ______________ 
3 
XR_ _____________ 
1 
RC ______________ 
9 
SY ______________ 
1 
UC ______________ 
3 
XS ______________ 
1 
RM ______________ 
9 
UD ______________ 
3 


RY ______________ 
9 
TH ______________ 
78 
UP ______________ 
2 
YT ______________ 
15 
RG ______________ 
7 
TE ______________ 
71 
UF ______________ 
1 
YF ______________ 
11 
RN ______________ 
7 
TO ______________ 
50 
uo ______________ 
1 
YS ______________ 
11 


·RF ______________ 
6 
.TI ______________ 
45 
uv ______________ 
1 
YO ______________ 
10 
RI.. ______________ 
5 
TY ______________ 
41 
YE ______________ 
9 
.RU ______________ 
5 
TW ______________ 
36 
VE ______________ 
57 
YA ______________ 
6 
. RV ______________ 
5 
TA ______________ 
28 
.vI ______________ 
12 
YN ______________ 
6 


'RW ______________ 
4 
TS ______________ 
19 
VA ______________ 
6 
YC ______________ 
4 


RH_ _____________ 
3 
TT ______________ 
19 
VO ______________ 
1 
YD ______________ 
4 
RB ______________ 
2 
TR_ _____________ 
17 
VT ______________ 
1 
YR ______________ 
4 
RJ ______________ 
1 
TF ______________ 
7 
YI ______________ 
3 


RK._ ___________ 
1 
TN ______________ 
7 
WE ______________ 
22 
yp ______________ 
3 
TC ______________ 
6 
WO ______________ 
19 
Yij ______________ 
2 


ST ______________ 
63 
TD ______________ 
6 
WI ______________ 
13 
YL ______________ 
2 


SE ______________ 
49 
™-------------- 
6 
WA ______________ 
12 
YM ______________ 
2 


SI ______________ 
34 
TL ______________ 
5 
WH_ _____________ 
4 
YG ______________ 
1 


SH ______________ 
26 
TU ______________ 
5 
WN ______________ 
2 
YH_ _____________ 
1 


SA ______________ 
24 
TB ______________ 
3 
WL ______________ 
1 
YU ______________ 
1 


SS ______________ 
19 
TP ______________ 
2 
WR_ _____________ 
1 
YW ______________ 
1 
SO ______________
15
TG ______________
1
WS ______________
1
SC ______________
13 
TQ ______________ 
1 
WY ______________ 
1 
ZE ______________ 
2 


SF ______________ 
12 
TZ ______________
1
ZA ______________
1
SU ______________
11
XT ______________
7
ZI ______________
1
SP ______________
10 
UR ______________ 
31 
XA ______________ 
2 


SD ______________ 
5 
UN ______________ 
21 
XC ______________
2
Total ____ 5,000


1 For arrangement alphabetically first under initial letters and then under final letters, see Table 6. 

TABLE 9-A.-The 428 different digraphs of Table 6, arranged first alphabetically according to their
final letters, and then according to their absolute frequencies


RA ______________ 
39 
EC ______________ 
32 
RE ______________ 
98 
GF ______________ 
2 


MA ______________ 
36 
IC ______________ 
22 
TE_ _____________ 
71 
PF ______________ 
1 
EA ______________ 
35 
NC ______________ 
19 
NE ______________ 
57 
CF ______________ 
2 
DA ______________ 
32 
AC ______________ 
14 
VE ______________ 
57 
MF ______________ 
1 


LA ______________
28
SC ______________
13 
SE_ _____________ 
49 
UF ______________ 
1 
TA ______________ 
28 
RC ______________ 
9 
EE ______________ 
42 
XF ______________ 
1 
NA ______________ 
26 
oc ______________ 
8 
LE ______________ 
37 


SA ______________ 
24 
TC ______________
6 
DE ______________ 
33 


CA ______________
20 
oc ______________ 
4 
CE ______________ 
32 
NG ______________ 
27 


HA ______________
20
YC ______________
4 
ME ______________ 
26 
IG ______________ 
19 


PA ______________ 
14 
cc ______________ 
3 
PE ______________ 
23 
UG ______________ 
8 
WA ______________
12 
HC ______________ 
3 
WE ______________ 
22 
RG ______________ 
7 
IA ______________ 
8 
LC ______________ 
3 
HE ______________ 
20 
AG ______________ 
6 
GA ______________ 
7 
MC ______________ 
3 
BE ______________ 
18 
EG ______________ 
4 


OA ______________ 
7 
uc ______________ 
3 
GE ______________ 
14 
DG ______________ 
2 


VA ______________ 
6 
·FC ______________ 
2 
IE ______________ 
13 
OG ______________ 
2 
YA ______________ 
6 
GC ______________ 
2 
UE ______________ 
11
SG ______________ 
2 
FA ______________ 
5 
xc ______________ 
2 
FE ______________ 
10 
FG ______________ 
1 


UA ______________
5
KC ______________
1
YE ______________
9
GG ______________
1


BA ______________
4
PC ______________
1
KE ______________
6
LG ______________
1
AA ______________
3 
OE ______________ 
3 
TG ______________ 
1 
XA ______________ 
2 
JE ______________ 
2 
YG ______________ 
1 


JA ______________
1
ED ______________
60
ZE ______________
2 
KA ______________ 
1 
ND ______________ 
52 
AE ______________ 
1 


ZA ______________
1 
AD ______________ 
27 
XE ______________ 
1 
RD ______________
17 
TH ______________ 
78 
AB ______________ 
6 
oo ______________ 
12 
SH ______________ 
26 


MB ______________ 
6 
LD ______________
9 
GH. _____________ 
20 
.I)s_ _____________ 
4 
oo ______________ 
8 
OF ______________ 
25 
CH ______________ 
14 


,EB ______________ 
4 
ID ______________ 
6 
EF ______________ 
18 
EH ______________ 
7 


-OB ______________ 
4 
TD ______________ 
6 
SF ______________ 
12 
NH ______________ 
4 
LB ______________ 
3 
SD ______________
5 
FF ______________ 
11 
WH-------------- 
4 


SB ______________
3
YD ______________
4 
YF ______________ 
11 
OH. _____________ 
3 
TB ______________ 
3 
UD ______________
3 
IF ______________ 
10 
PH. _____________ 
3 


UB ______________ 
3 
HD ______________ 
2 
NF ______________ 
9 
RH-------------- 
3 
IB ______________ 
2 
CD ______________
1
DF ______________
8 
AH_ _____________ 
2 
.NB ______________ 
2 
FD ______________ 
1 
TF ______________ 
7 
DH. _____________ 
2 
RB ______________ 
2 
GD ______________ 
1 
RF ______________ 
6 
LH_ _____________ 
1 


·YB ______________ 
2 
MD ______________ 
1 
HF ______________ 
5 
MH ______________ 
1 
HB ______________ 
1 
,PD ______________ 
1 
AF ______________ 
4 
XH ______________ 
1 


PB ______________ 
1 
XD ______________ 
1 
LF ______________ 
3 
·YH ______________ 
1 


TABLE 9-A, Contd.-The 428 different digraphs of Table 6, arranged first alphabetically according
to their final letters, and then according to their absolute frequencies


TI ______________ 
45 
LL_ _____________ 
27 
AN ______________ 
64 
RP _____________ 
13 
FI ______________
39 
IL_ _____________ 
23 
UN ______________ 
21 
AP ______________ 
12 
SI ______________ 
34 
OL ______________ 
19 
NN ______________ 
8 
pp ______________ 
11 
HI ______________ 
33 
PL ______________ 
13 
RN ______________ 
7 
SP ______________ 
10 
NI ______________ 
30 
BL ______________ 
6 
TN ______________
7
MP ______________ 
8 
RI. _____________ 
30 
UL_ ________ ----- 
6 
m ______________ 
6 
IP ______________ 
7 
DI ______________ 
27 
CL ______________ 
5 
DN ______________ 
4 
OP _____________ 
5 
EI ______________ 
27 
NL-------------- 
5 
SN ______________ 
4 
LP ______________ 
3 
LI ______________
20
RL ______________
5 
GN ______________ 
8 
NP •.•••••••••••• 
3 
AI ______________ 
17 
TL ______________
5 
HN ______________ 
3 
YP ----------· --- 
3 
WI ______________ 
13 
DL ______________ 
3 
.wN ______________ 
2 
GP ______________ 
2 
VI ______________ 
12 
FL ______________ 
2 
CN ______________ 
l 
TP -------------- 
2 
)4! ______________ 
~ 
QL _____________ 
2 
KN-------------- 
i 
UP ______________ 
2 


1CI ______________ 
1 
SL------------- 
2 
LN-------------- 
1 
XP------------ 
2 
. }>! ______________ 
6 
''YJ.. _____________ 
~ 
PN ______________ 
l 
· FP • _____________ 
1 


'.GI-----------:·r~, 
.6 
,fi.,_ ____________ 
1 
)[N ______________ 
l 
iiP ----------- 
1 
·or ______________ 
6 
~------------- 
'l 
EQ_ _____________ 
12 
OI ______________ 
5 
WL.. _____________ 
1 
·:To ______________ 
lSP 
DQ_ _____________ 
2 
YI ______________ 
"8 
'co ______________ 


4~ 
HQ_ _____________ 
1 
BI ______________ 
2 
. OM_ _____________ 
25 
IQ ______________ 
41 
NQ. _____________ 
1 
;KI ______________ 
2 
AM_ _____________ 
14 
FQ ______________ 
40 
'l'Q_ _____________ 
1 
XI ______________ 
2 
RO ___________ 
EM..---------- 
14 
28 
Zl ______________ 
1 
MM._ _____________ 
13 
HQ ______________ 
20 


ER_ _____________ 
87 


1'0 ______________ 


OR. _____________ 
64 
IM.. _____________ 
9 
19 
AR_ _____________ 
44 
AJ ______________ 
1 
RM_ _____________ 
9 
NQ ______________ 
18 
BJ ______________ 
1 
17 
UR.. ____________ 
31 
TM_ _____________ 
6 
pa ______________ 
IR. ___________ 


~7 
OJ-------------- 
l 
DM.. ___________ 
5 
oo _____________ 
16 
PR. ____________ 
18 
EJ ______________ 
1 
so ______________ 
NM_ _____________ 
5 
15 
HR_ _____________ 
1'1 
GJ ______________ 
1 
(JM_ _____________ 
5 
LO ______________ 
13 
NJ _____________ 
1 


TR_ ___________ 
17 
PM_ _____________ 
4 
EO ______________ 
12 
DR. _____________ 
12 
OJ------------- 
1 
SM _____________ 
3 
MO ______________ 
10 
RR.------------ 
u 
RJ ______________ 
1 
m _____________ 
2 
YO ______________ 
10 
f'R. ___________ 
9 
LM_ _____________ 
2 
GO ______________ 
6 
GR_ _____________ 
5 
CK. _____________ 
4 


YM_ ____________ 
2 
oo ______________ 
6 
SR. _____________ 
5 
AK. ______ ------- 
2 
BM.. _____________ 
1 
BO ______________ 
4 
CR. _____________ 
4 
IK. _____________ 
2 
CM.. _____________ 
1 
AO ______________ 
2 
NR_ _____________ 
4 
NK.. ____________ 
2 
FM.. _____________ 
1 
J Q _____________ 
2 
YR_ _____________ 
4 
OK. _____________ 
2 
GM.. _____________ 
1 
uo ______________ 
1 
BR. _____________ 
2 
.RK.. ____________ 
1 
QM.. _____________ 
1 
vo ______________ 
1 
LR_ _____________ 
2 
.SK. ___________ '" 
1 
xo ______________ 
1 
,MR_ _____________ 


~ 
EN ____________ 
111 
QR.. _____________ 
1 
AL.. _____________ 
32 
ON ______________ 
77 
op ______________ 
25 
WR_ ____________ 
1 
EL.. _____________ 
29 
IN ______________ 
75 
EP ••••••........ 
20 
XR ______________
1

TABLE 9-A, Concluded.-The 428 different digraphs of Table 6, arranged first alphabetically
according to their final letters, and then according to their absolute frequencies
ES ______________ 54
AS ______________ 41
IS ______________ 35
RS ______________ 31
NS ______________ 24
SS ______________ 19
TS ______________ 19
OS ______________ 14
DS ______________ 13
US ______________ 12
YS ______________ 11
LS ______________ 6
PS ______________ 6
HS ______________ 4
MS ______________ 4
FS ______________ 3
GS ______________ 3
BS ______________ 1
CS ______________ 1
KS ______________ 1
WS ______________ 1
XS ______________ 1

NT ______________ 82
ST ______________ 63
AT ______________ 47
RT ______________ 42
ET ______________ 37
HT ______________ 28
IT ______________ 27
OT ______________ 19
TT ______________ 19
DT ______________ 15
YT ______________ 15
CT ______________ 14
UT ______________ 12
FT ______________ 11
LT ______________ 8
PT ______________ 8
XT ______________ 7
GT ______________ 4
MT ______________ 2
BT ______________ 1
VT ______________ 1

OU ______________ 37
QU ______________ 15
AU ______________ 13
SU ______________ 11
HU ______________ 8
NU ______________ 7
DU ______________ 5
RU ______________ 5
TU ______________ 5
CU ______________ 4
EU ______________ 3
FU ______________ 3
PU ______________ 3
BU ______________ 2
GU ______________ 2
JU ______________ 2
LU ______________ 2
MU ______________ 2
YU ______________ 1

IV ______________ 25
EV ______________ 20
AV ______________ 7
OV ______________ 7
RV ______________ 5
DV ______________ 3
NV ______________ 3
LV ______________ 2
PV ______________ 1
SV ______________ 1
UV ______________ 1

TW ______________ 36
OW ______________ 8
EW ______________ 7
DW ______________ 4
RW ______________ 4
SW ______________ 4
AW ______________ 3
NW ______________ 3
LW ______________ 2
CW ______________ 1
FW ______________ 1
GW ______________ 1
HW ______________ 1
PW ______________ 1
YW ______________ 1

IX ______________ 15
EX ______________ 7
OX ______________ 1

TY ______________ 41
AY ______________ 12
LY ______________ 10
RY ______________ 9
BY ______________ 7
NY ______________ 5
EY ______________ 4
MY ______________ 2
OY ______________ 2
CY ______________ 1
DY ______________ 1
FY ______________ 1
HY ______________ 1
PY ______________ 1
SY ______________ 1
WY ______________ 1

IZ ______________ 2
EZ ______________ 1
TZ ______________ 1

Total ____ 5,000

TABLE 9-B.-The 18 digraphs composing 25% of the 5,000 digraphs of Table 6, arranged
alphabetically according to their final letters--

(1) AND ACCORDING TO THEIR INITIAL LETTERS

ED ______________ 60
ND ______________ 52

NE ______________ 57
RE ______________ 98
SE ______________ 49
TE ______________ 71
VE ______________ 57

TH ______________ 78

AN ______________ 64
EN ______________ 111
IN ______________ 75
ON ______________ 77

TO ______________ 50

ER ______________ 87
OR ______________ 64

ES ______________ 54

NT ______________ 82
ST ______________ 63

Total ____ 1,249

(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

ED ______________ 60
ND ______________ 52

RE ______________ 98
TE ______________ 71
NE ______________ 57
VE ______________ 57
SE ______________ 49

TH ______________ 78

EN ______________ 111
ON ______________ 77
IN ______________ 75
AN ______________ 64

TO ______________ 50

ER ______________ 87
OR ______________ 64

ES ______________ 54

NT ______________ 82
ST ______________ 63

Total ____ 1,249


TABLE 9-C.-The 53 digraphs composing 50% of the 5,000 digraphs of Table 6, arranged
alphabetically according to their final letters--

(1) AND ACCORDING TO THEIR INITIAL LETTERS

DA ______________ 32
EA ______________ 35
LA ______________ 28
MA ______________ 36
RA ______________ 39
TA ______________ 28

EC ______________ 32

ED ______________ 60
ND ______________ 52

CE ______________ 32
DE ______________ 33
EE ______________ 42
LE ______________ 37
NE ______________ 57
RE ______________ 98
SE ______________ 49
TE ______________ 71
VE ______________ 57

TH ______________ 78

FI ______________ 39
HI ______________ 33
NI ______________ 30
RI ______________ 30
SI ______________ 34
TI ______________ 45

AL ______________ 32
EL ______________ 29

AN ______________ 64
EN ______________ 111
IN ______________ 75
ON ______________ 77

CO ______________ 41
FO ______________ 40
IO ______________ 41
RO ______________ 28
TO ______________ 50

AR ______________ 44
ER ______________ 87
OR ______________ 64
UR ______________ 31

AS ______________ 41
ES ______________ 54
IS ______________ 35
RS ______________ 31

AT ______________ 47
ET ______________ 37
HT ______________ 28
NT ______________ 82
RT ______________ 42
ST ______________ 63

OU ______________ 37

TW ______________ 36

TY ______________ 41

Total ____ 2,495

(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

RA ______________ 39
MA ______________ 36
EA ______________ 35
DA ______________ 32
LA ______________ 28
TA ______________ 28

EC ______________ 32

ED ______________ 60
ND ______________ 52

RE ______________ 98
TE ______________ 71
NE ______________ 57
VE ______________ 57
SE ______________ 49
EE ______________ 42
LE ______________ 37
DE ______________ 33
CE ______________ 32

TH ______________ 78

TI ______________ 45
FI ______________ 39
SI ______________ 34
HI ______________ 33
NI ______________ 30
RI ______________ 30

AL ______________ 32
EL ______________ 29

EN ______________ 111
ON ______________ 77
IN ______________ 75
AN ______________ 64

TO ______________ 50
CO ______________ 41
IO ______________ 41
FO ______________ 40
RO ______________ 28

ER ______________ 87
OR ______________ 64
AR ______________ 44
UR ______________ 31

ES ______________ 54
AS ______________ 41
IS ______________ 35
RS ______________ 31

NT ______________ 82
ST ______________ 63
AT ______________ 47
RT ______________ 42
ET ______________ 37
HT ______________ 28

OU ______________ 37

TW ______________ 36

TY ______________ 41

Total ____ 2,495


TABLE 9-D.-The 117 digraphs composing 75% of the 5,000 digraphs of Table 6, arranged
alphabetically according to their final letters--

(1) AND ACCORDING TO THEIR INITIAL LETTERS

CA ______________ 20
DA ______________ 32
EA ______________ 35
HA ______________ 20
LA ______________ 28
MA ______________ 36
NA ______________ 26
PA ______________ 14
RA ______________ 39
SA ______________ 24
TA ______________ 28

AC ______________ 14
EC ______________ 32
IC ______________ 22
NC ______________ 19

AD ______________ 27
ED ______________ 60
ND ______________ 52
RD ______________ 17

BE ______________ 18
CE ______________ 32
DE ______________ 33
EE ______________ 42
GE ______________ 14
HE ______________ 20
IE ______________ 13
LE ______________ 37
ME ______________ 26
NE ______________ 57
PE ______________ 23
RE ______________ 98
SE ______________ 49
TE ______________ 71
VE ______________ 57
WE ______________ 22

EF ______________ 18
OF ______________ 25

IG ______________ 19
NG ______________ 27

CH ______________ 14
GH ______________ 20
SH ______________ 26
TH ______________ 78

AI ______________ 17
DI ______________ 27
EI ______________ 27
FI ______________ 39
HI ______________ 33
LI ______________ 20
NI ______________ 30
RI ______________ 30
SI ______________ 34
TI ______________ 45

AL ______________ 32
EL ______________ 29
IL ______________ 23
LL ______________ 27
OL ______________ 19

AM ______________ 14
EM ______________ 14
OM ______________ 25

AN ______________ 64
EN ______________ 111
IN ______________ 75
ON ______________ 77
UN ______________ 21

CO ______________ 41
DO ______________ 16
FO ______________ 40
HO ______________ 20
IO ______________ 41
LO ______________ 13
NO ______________ 18
PO ______________ 17
RO ______________ 28
SO ______________ 15
TO ______________ 50
WO ______________ 19

EP ______________ 20
OP ______________ 25

AR ______________ 44
ER ______________ 87
HR ______________ 17
IR ______________ 27
OR ______________ 64
PR ______________ 18
TR ______________ 17
UR ______________ 31

AS ______________ 41
DS ______________ 13
ES ______________ 54
IS ______________ 35
NS ______________ 24
OS ______________ 14
RS ______________ 31
SS ______________ 19
TS ______________ 19

AT ______________ 47
CT ______________ 14
DT ______________ 15
ET ______________ 37
HT ______________ 28
IT ______________ 27
NT ______________ 82
OT ______________ 19
RT ______________ 42
ST ______________ 63
TT ______________ 19
YT ______________ 15

AU ______________ 13
OU ______________ 37
QU ______________ 15

EV ______________ 20
IV ______________ 25

TW ______________ 36

IX ______________ 15

TY ______________ 41

Total ____ 3,745


(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

RA ______________ 39
MA ______________ 36
EA ______________ 35
DA ______________ 32
LA ______________ 28
TA ______________ 28
NA ______________ 26
SA ______________ 24
CA ______________ 20
HA ______________ 20
PA ______________ 14

EC ______________ 32
IC ______________ 22
NC ______________ 19
AC ______________ 14

ED ______________ 60
ND ______________ 52
AD ______________ 27
RD ______________ 17

RE ______________ 98
TE ______________ 71
NE ______________ 57
VE ______________ 57
SE ______________ 49
EE ______________ 42
LE ______________ 37
DE ______________ 33
CE ______________ 32
ME ______________ 26
PE ______________ 23
WE ______________ 22
HE ______________ 20
BE ______________ 18
GE ______________ 14
IE ______________ 13

OF ______________ 25
EF ______________ 18

NG ______________ 27
IG ______________ 19

TH ______________ 78
SH ______________ 26
GH ______________ 20
CH ______________ 14

TI ______________ 45
FI ______________ 39
SI ______________ 34
HI ______________ 33
NI ______________ 30
RI ______________ 30
DI ______________ 27
EI ______________ 27
LI ______________ 20
AI ______________ 17

AL ______________ 32
EL ______________ 29
LL ______________ 27
IL ______________ 23
OL ______________ 19

OM ______________ 25
AM ______________ 14
EM ______________ 14

EN ______________ 111
ON ______________ 77
IN ______________ 75
AN ______________ 64
UN ______________ 21

TO ______________ 50
CO ______________ 41
IO ______________ 41
FO ______________ 40
RO ______________ 28
HO ______________ 20
WO ______________ 19
NO ______________ 18
PO ______________ 17
DO ______________ 16
SO ______________ 15
LO ______________ 13

OP ______________ 25
EP ______________ 20

ER ______________ 87
OR ______________ 64
AR ______________ 44
UR ______________ 31
IR ______________ 27
PR ______________ 18
HR ______________ 17
TR ______________ 17

ES ______________ 54
AS ______________ 41
IS ______________ 35
RS ______________ 31
NS ______________ 24
SS ______________ 19
TS ______________ 19
OS ______________ 14
DS ______________ 13

NT ______________ 82
ST ______________ 63
AT ______________ 47
RT ______________ 42
ET ______________ 37
HT ______________ 28
IT ______________ 27
OT ______________ 19
TT ______________ 19
DT ______________ 15
YT ______________ 15
CT ______________ 14

OU ______________ 37
QU ______________ 15
AU ______________ 13

IV ______________ 25
EV ______________ 20

TW ______________ 36

IX ______________ 15

TY ______________ 41

Total ____ 3,745


TABLE 9-E.-All the 428 different digraphs of Table 6 arranged alphabetically first according to
their final letters and then according to their initial letters


(SEE TABLE 6; READ DOWN THE COLUMNS)
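
The several arrangements of Table 9 are merely different sortings of one and the same set of digraph counts. The following is a minimal sketch in Python, not part of the original text; the function names are hypothetical, and the counts are the 18 digraphs of Table 9-B, used only as illustrative data. The rule for breaking ties between equal frequencies (alphabetical by initial letter) is an assumption.

```python
from collections import defaultdict

# Illustrative data only: the 18 digraphs of Table 9-B with their frequencies.
COUNTS = {
    "ED": 60, "ND": 52, "NE": 57, "RE": 98, "SE": 49, "TE": 71, "VE": 57,
    "TH": 78, "AN": 64, "EN": 111, "IN": 75, "ON": 77, "TO": 50,
    "ER": 87, "OR": 64, "ES": 54, "NT": 82, "ST": 63,
}


def by_final_then_initial(counts):
    """Arrange digraphs by final letter, then by initial letter."""
    return sorted(counts, key=lambda d: (d[1], d[0]))


def by_final_then_frequency(counts):
    """Arrange digraphs by final letter, then by descending frequency.

    Equal frequencies are ordered by initial letter (an assumed tie rule).
    """
    return sorted(counts, key=lambda d: (d[1], -counts[d], d[0]))


def group_by_final(ordered):
    """Collect an ordered list of digraphs into groups sharing a final letter."""
    groups = defaultdict(list)
    for digraph in ordered:
        groups[digraph[1]].append(digraph)
    return dict(groups)


if __name__ == "__main__":
    grouped = group_by_final(by_final_then_frequency(COUNTS))
    for final, digraphs in grouped.items():
        print(final, " ".join(f"{d} {COUNTS[d]}" for d in digraphs))
    print("Total", sum(COUNTS.values()))
```

Applied to the full set of counts of Table 6, the same two sort keys would yield the arrangements of Tables 9-E and 9-A respectively.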


TABLE 10-A.-The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged according to their absolute frequencies

ENT__________________________ 569
ION__________________________ 260
AND__________________________ 228
ING__________________________ 226
IVE__________________________ 225
TIO__________________________ 221
FOR__________________________ 218
OUR__________________________ 211
THI__________________________ 211
ONE__________________________ 210
NIN__________________________ 207
STO__________________________ 202
EEN__________________________ 196
GHT__________________________ 196
INE__________________________ 192
VEN__________________________ 190
EVE__________________________ 177
EST__________________________ 176
TEE__________________________ 174
TOP__________________________ 174
NTH__________________________ 171
TWE__________________________ 170
TWO__________________________ 163
ATI__________________________ 160
THR__________________________ 158
NTY__________________________ 157
HRE__________________________ 153
WEN__________________________ 153
FOU__________________________ 152
ORT__________________________ 146
REE__________________________ 146
SIX__________________________ 146
ASH__________________________ 143
DAS__________________________ 140
IGH__________________________ 140
ERE__________________________ 138
COM__________________________ 136
ATE__________________________ 135
EIG__________________________ 135
FIV__________________________ 135
MEN__________________________ 131
SEV__________________________ 131
ERS__________________________ 126
UND__________________________ 125
NET__________________________ 118
PER__________________________ 115
STA__________________________ 115
TER__________________________ 115
EQU__________________________ 114
RED__________________________ 113
TED__________________________ 112
ERI__________________________ 109
HIR__________________________ 106
IRT__________________________ 105
DER__________________________ 101
DRE__________________________ 100

TABLE 10-B.-The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their initial letters and then
according to their absolute frequencies

AND__________________________ 228
ATI__________________________ 160
ASH__________________________ 143
ATE__________________________ 135

COM__________________________ 136

DAS__________________________ 140
DER__________________________ 101
DRE__________________________ 100

ENT__________________________ 569
EEN__________________________ 196
EVE__________________________ 177
EST__________________________ 176
ERE__________________________ 138
EIG__________________________ 135
ERS__________________________ 126
EQU__________________________ 114
ERI__________________________ 109

FOR__________________________ 218
FOU__________________________ 152
FIV__________________________ 135

GHT__________________________ 196

HRE__________________________ 153
HIR__________________________ 106

ION__________________________ 260
ING__________________________ 226
IVE__________________________ 225
INE__________________________ 192
IGH__________________________ 140
IRT__________________________ 105

MEN__________________________ 131

NIN__________________________ 207
NTH__________________________ 171
NTY__________________________ 157
NET__________________________ 118

OUR__________________________ 211
ONE__________________________ 210
ORT__________________________ 146

PER__________________________ 115

REE__________________________ 146
RED__________________________ 113

STO__________________________ 202
SIX__________________________ 146
SEV__________________________ 131
STA__________________________ 115

TIO__________________________ 221
THI__________________________ 211
TEE__________________________ 174
TOP__________________________ 174
TWE__________________________ 170
TWO__________________________ 163
THR__________________________ 158
TER__________________________ 115
TED__________________________ 112

UND__________________________ 125

VEN__________________________ 190

WEN__________________________ 153


TABLE 10-C.-The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their central letters and then
according to their absolute frequencies

DAS__________________________ 140

EEN__________________________ 196
VEN__________________________ 190
TEE__________________________ 174
WEN__________________________ 153
REE__________________________ 146
MEN__________________________ 131
SEV__________________________ 131
NET__________________________ 118
PER__________________________ 115
TER__________________________ 115
RED__________________________ 113
TED__________________________ 112
DER__________________________ 101

IGH__________________________ 140

THI__________________________ 211
GHT__________________________ 196
THR__________________________ 158

TIO__________________________ 221
NIN__________________________ 207
SIX__________________________ 146
EIG__________________________ 135
FIV__________________________ 135
HIR__________________________ 106

ENT__________________________ 569
AND__________________________ 228
ING__________________________ 226
ONE__________________________ 210
INE__________________________ 192
UND__________________________ 125

ION__________________________ 260
FOR__________________________ 218
TOP__________________________ 174
FOU__________________________ 152
COM__________________________ 136

EQU__________________________ 114

HRE__________________________ 153
ORT__________________________ 146
ERE__________________________ 138
ERS__________________________ 126
ERI__________________________ 109
IRT__________________________ 105
DRE__________________________ 100

EST__________________________ 176
ASH__________________________ 143

STO__________________________ 202
NTH__________________________ 171
ATI__________________________ 160
NTY__________________________ 157
ATE__________________________ 135
STA__________________________ 115

OUR__________________________ 211

IVE__________________________ 225
EVE__________________________ 177

TWE__________________________ 170
TWO__________________________ 163

TABLE 10-D.-The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their final letters and then according
to their absolute frequencies

STA__________________________ 115

AND__________________________ 228
UND__________________________ 125
RED__________________________ 113
TED__________________________ 112

IVE__________________________ 225
ONE__________________________ 210
INE__________________________ 192
EVE__________________________ 177
TEE__________________________ 174
TWE__________________________ 170
HRE__________________________ 153
REE__________________________ 146
ERE__________________________ 138
ATE__________________________ 135
DRE__________________________ 100

ING__________________________ 226
EIG__________________________ 135

NTH__________________________ 171
ASH__________________________ 143
IGH__________________________ 140

THI__________________________ 211
ATI__________________________ 160
ERI__________________________ 109

COM__________________________ 136

ION__________________________ 260
NIN__________________________ 207
EEN__________________________ 196
VEN__________________________ 190
WEN__________________________ 153
MEN__________________________ 131

TIO__________________________ 221
STO__________________________ 202
TWO__________________________ 163

TOP__________________________ 174

FOR__________________________ 218
OUR__________________________ 211
THR__________________________ 158
PER__________________________ 115
TER__________________________ 115
HIR__________________________ 106
DER__________________________ 101

DAS__________________________ 140
ERS__________________________ 126

ENT__________________________ 569
GHT__________________________ 196
EST__________________________ 176
ORT__________________________ 146
NET__________________________ 118
IRT__________________________ 105

FOU__________________________ 152
EQU__________________________ 114

FIV__________________________ 135
SEV__________________________ 131

SIX__________________________ 146

NTY__________________________ 157

TABLE 11-A.-The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged according to their absolute frequencies

TION ........ 218
EVEN ........ 168
TEEN ........ 163
ENTY ........ 161
STOP ........ 154
WENT ........ 153
NINE ........ 153
TWEN ........ 152
THRE ........ 149
FOUR ........ 144
IGHT ........ 140
FIVE ........ 135
HREE ........ 134
EIGH ........ 132
DASH ........ 132
SEVE ........ 121
ENTH ........ 114
MENT ........ 111
THIR ........ 104
EENT ........ 102
REQU ........  98
HIRT ........  97
COMM ........  93
QUES ........  87
UEST ........  87
EQUE ........  86
NDRE ........  77
OMMA ........  71
LLAR ........  71
OLLA ........  70
VENT ........  70
DOLL ........  68
LARS ........  68
THIS ........  68
PERI ........  67
ERIO ........  66
ASHT ........  64
HUND ........  64
DRED ........  63
RIOD ........  63
IVED ........  62
ENTS ........  62
FFIC ........  62
FROM ........  59
IRTY ........  59
RTEE ........  59
UNDR ........  59
NAUG ........  56
OURT ........  56
UGHT ........  56
STAT ........  54
AUGH ........  52
CENT ........  52
FICE ........  50


TABLE 11-B.-The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams, arranged first alphabetically according to their initial letters, and then
according to their absolute frequencies

ASHT ........  64
AUGH ........  52

COMM ........  93
CENT ........  52

DASH ........ 132
DOLL ........  68
DRED ........  63

EVEN ........ 168
ENTY ........ 161
EIGH ........ 132
ENTH ........ 114
EENT ........ 102
EQUE ........  86
ERIO ........  66
ENTS ........  62

FOUR ........ 144
FIVE ........ 135
FFIC ........  62
FROM ........  59
FICE ........  50

HREE ........ 134
HIRT ........  97
HUND ........  64

IGHT ........ 140
IVED ........  62
IRTY ........  59

LLAR ........  71
LARS ........  68

MENT ........ 111

NINE ........ 153
NDRE ........  77
NAUG ........  56

OMMA ........  71
OLLA ........  70
OURT ........  56

PERI ........  67

QUES ........  87

REQU ........  98
RIOD ........  63
RTEE ........  59

STOP ........ 154
SEVE ........ 121
STAT ........  54

TION ........ 218
TEEN ........ 163
TWEN ........ 152
THRE ........ 149
THIR ........ 104
THIS ........  68

UEST ........  87
UNDR ........  59
UGHT ........  56

VENT ........  70

WENT ........ 153
TABLE 11-C.-The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their second letters and then
according to their absolute frequencies


DASH ........ 132
LARS ........  68
NAUG ........  56

NDRE ........  77

TEEN ........ 163
WENT ........ 153
SEVE ........ 121
MENT ........ 111
EENT ........ 102
REQU ........  98
UEST ........  87
VENT ........  70
PERI ........  67
CENT ........  52

FFIC ........  62

IGHT ........ 140
UGHT ........  56

THRE ........ 149
THIR ........ 104
THIS ........  68

TION ........ 218
NINE ........ 153
FIVE ........ 135
EIGH ........ 132
HIRT ........  97
RIOD ........  63
FICE ........  50

LLAR ........  71
OLLA ........  70

OMMA ........  71

ENTY ........ 161
ENTH ........ 114
ENTS ........  62
UNDR ........  59

FOUR ........ 144
COMM ........  93
DOLL ........  68

EQUE ........  86

HREE ........ 134
ERIO ........  66
DRED ........  63
FROM ........  59
IRTY ........  59

ASHT ........  64

STOP ........ 154
RTEE ........  59
STAT ........  54

QUES ........  87
HUND ........  64
OURT ........  56
AUGH ........  52

EVEN ........ 168
IVED ........  62

TWEN ........ 152


TABLE 11-D.-The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their third letters and then according
to their absolute frequencies

LLAR ........  71
STAT ........  54

FICE ........  50

UNDR ........  59

EVEN ........ 168
TEEN ........ 163
TWEN ........ 152
HREE ........ 134
QUES ........  87
DRED ........  63
IVED ........  62
RTEE ........  59

EIGH ........ 132
AUGH ........  52

IGHT ........ 140
ASHT ........  64
UGHT ........  56

THIR ........ 104
THIS ........  68
ERIO ........  66
FFIC ........  62

OLLA ........  70
DOLL ........  68

COMM ........  93
OMMA ........  71

WENT ........ 153
NINE ........ 153
MENT ........ 111
EENT ........ 102
VENT ........  70
HUND ........  64
CENT ........  52

TION ........ 218
STOP ........ 154
RIOD ........  63
FROM ........  59

REQU ........  98

THRE ........ 149
HIRT ........  97
NDRE ........  77
LARS ........  68
PERI ........  67
OURT ........  56

DASH ........ 132
UEST ........  87

ENTY ........ 161
ENTH ........ 114
ENTS ........  62
IRTY ........  59

FOUR ........ 144
EQUE ........  86
NAUG ........  56

FIVE ........ 135
SEVE ........ 121


TABLE 11-E.-The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their final letters and then according
to their absolute frequencies

OMMA ........  71
OLLA ........  70

FFIC ........  62

HUND ........  64
DRED ........  63
RIOD ........  63
IVED ........  62

NINE ........ 153
THRE ........ 149
FIVE ........ 135
HREE ........ 134
SEVE ........ 121
EQUE ........  86
NDRE ........  77
RTEE ........  59
FICE ........  50

NAUG ........  56

DASH ........ 132
EIGH ........ 132
ENTH ........ 114
AUGH ........  52

PERI ........  67

DOLL ........  68

COMM ........  93
FROM ........  59

TION ........ 218
EVEN ........ 168
TEEN ........ 163
TWEN ........ 152

ERIO ........  66

STOP ........ 154

FOUR ........ 144
THIR ........ 104
LLAR ........  71
UNDR ........  59

QUES ........  87
THIS ........  68
LARS ........  68
ENTS ........  62

WENT ........ 153
IGHT ........ 140
MENT ........ 111
EENT ........ 102
HIRT ........  97
UEST ........  87
VENT ........  70
ASHT ........  64
UGHT ........  56
OURT ........  56
STAT ........  54
CENT ........  52

REQU ........  98

ENTY ........ 161
IRTY ........  59
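
The rearrangements in Tables 11-B to 11-E all follow one rule: sort on the letter in a chosen position, then on descending frequency. The following Python sketch illustrates that rule. The function names `arrange` and `print_table` are hypothetical, introduced only for this illustration, and the sample data are just eight rows copied from Table 11-A.

```python
# Eight sample rows (tetragraph, absolute frequency) copied from Table 11-A.
SAMPLE = [
    ("TION", 218), ("EVEN", 168), ("TEEN", 163), ("ENTY", 161),
    ("STOP", 154), ("WENT", 153), ("NINE", 153), ("DASH", 132),
]


def arrange(rows, position):
    """Sort rows by the letter at `position` (0 = initial, 3 = final),
    then by descending frequency. Python's sort is stable, so rows that
    tie on both keys keep their input order."""
    return sorted(rows, key=lambda row: (row[0][position], -row[1]))


def print_table(rows):
    """Print rows in a leader-dot layout with right-aligned counts."""
    for tetragraph, count in rows:
        print(f"{tetragraph} ........ {count:>3}")


# Arrangement by final letter, as in Table 11-E.
print_table(arrange(SAMPLE, 3))
```

Calling `arrange(SAMPLE, 0)` gives the Table 11-B ordering for the same sample, and positions 1 and 2 correspond to Tables 11-C and 11-D.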


TABLE 12.-Average and mean lengths of words

Number of letters   Number of times   Number of
     in word         word appears      letters

         1                 378              378
         2                 973            1,946
         3               1,307            3,921
         4               1,635            6,540
         5               1,410            7,050
         6               1,143            6,858
         7               1,009            7,063
         8                 717            5,736
         9                 476            4,284
        10                 274            2,740
        11                 161            1,771
        12                  86            1,032
        13                  23              299
        14                  23              322
        15                   4               60

       120               9,619           50,000


(1) Mean length of words ........ 5.2 letters.
(2) Average length of messages ........ 217 letters.
(3) Mean length of messages ........ 191 letters.
(4) Mode (most frequent) length ........ 105-114 letters.
(5) It is extremely unusual to find 5 consecutive letters without at least one vowel. 
(6) The average number of letters between vowels is 2. 
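
The word-length figure above can be checked arithmetically from the totals of Table 12: 50,000 letters divided by 9,619 words is about 5.2 letters per word. The Python sketch below performs that check and adds a small helper for the vowel observation in note (5). All names are hypothetical, and counting Y as a vowel is an assumption, since the notes do not list the vowels.

```python
# Word-length distribution copied from Table 12: word length -> number of words.
WORD_COUNTS = {
    1: 378, 2: 973, 3: 1307, 4: 1635, 5: 1410,
    6: 1143, 7: 1009, 8: 717, 9: 476, 10: 274,
    11: 161, 12: 86, 13: 23, 14: 23, 15: 4,
}


def totals(counts):
    """Return (total words, total letters) for a length -> count mapping."""
    words = sum(counts.values())
    letters = sum(length * n for length, n in counts.items())
    return words, letters


def average_length(counts):
    """Arithmetic average word length: total letters / total words."""
    words, letters = totals(counts)
    return letters / words


# Assumption: Y is counted as a vowel.
VOWELS = frozenset("AEIOUY")


def longest_vowelless_run(text, vowels=VOWELS):
    """Length of the longest run of consecutive letters containing no vowel."""
    longest = current = 0
    for ch in text.upper():
        if not ch.isalpha():
            continue  # spaces, digits and punctuation are ignored
        current = 0 if ch in vowels else current + 1
        longest = max(longest, current)
    return longest


print(totals(WORD_COUNTS))                    # (9619, 50000)
print(round(average_length(WORD_COUNTS), 1))  # 5.2
print(longest_vowelless_run("STRENGTH"))      # 4
```

A run of 5 or more reported by `longest_vowelless_run` on a trial decipherment would, by note (5), be a reason to doubt that the text is correct plain language.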
Deciphering .......... 31c .......... 52.
Dictionary words used as code words .......... 47b .......... 99.
Distribution:
65.
Frequency distribution .......... 9, 17, 19, 26e, 44c .......... 11-13, 27-28, 31-33, 42,
General system, determination of .......... 4a, 6, 13, 50 .......... 7, 8-9, 18-22, 103-104.
Grilles .......... 47c .......... 99.
Language employed in a cryptogram .......... 4a, 5 .......... 7, 8.
Language frequency characteristics .......... 9d, 25 .......... 12, 41.
Language peculiarities .......... 5b .......... 8.
Accented .......... 5b .......... 8.
Low-frequency .......... 31c .......... 52.
Medium-frequency consonants .......... 13d .......... 19.
Messages:
    General phraseology .......... 49a .......... 101.
Missing letters .......... 5b, 14e .......... 8, 24.
Mixed sequence .......... 21d .......... 39.
Monoalphabet distinguished from polyalphabet .......... 12, 14 .......... 18, 22-25.
Normal distribution .......... 17b, c .......... 27, 28.
Normal frequency .......... 9, 11, 25 .......... 11-13, 16-17, 41.
Phraseology of messages .......... 49a .......... 101.
Plain component, completion of .......... 20a .......... 34.
Plain-text unit .......... 41c .......... 70.
Random text, number of blanks .......... 14f .......... 24.
17, 23.
Repetitions .......... 13g, 24b, c, 27 .......... 21, 40, 41, 43-46.
Of digraphs and trigraphs .......... 211 .......... 46.
Mixed .......... 21d .......... 39.
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148274-88--10 


I 


'l 


I 


I' 


i42 


                                                  Paragraphs          Pages
Solutions of a subjective nature .......... 3 .......... 5.
Specific key .......... 4, 7, 19a, 31b .......... 7, 10, 31, 52.
Standard alphabets .......... 15a, 16, 20b, 38e .......... 25, 26, 36, 65.
Subjective solutions .......... 3 .......... 5.
Substitution:
    Biliteral .......... 41 .......... 70-71.
    Digraphic .......... 41c, 42a .......... 70, 72.
    Distinguished from transposition .......... 12, 13 .......... 18, 18-22.
    Polygraphic .......... 41c .......... 70.
    Multiliteral .......... 41c .......... 70.
    Trigraphic .......... 41c .......... 70.